On 07/02/2014 11:45 PM, Greg Freemyer wrote:
On July 2, 2014 2:01:08 PM EDT, Anton Aylward wrote:
On 07/02/2014 12:58 PM, Greg Freemyer wrote:
<snip>
But when I first copied the data between 2 drives I was only getting about 60 MB/sec throughput (120 MB/sec combined reads and writes).
Suppose you use a 'copy' program that does fread()-style buffering. You have a 512-byte read buffer and a 512-byte write buffer. The core of your code does putchar(getchar()); that is, one byte at a time. And don't forget those 512-byte buffers.
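To make the per-call overhead concrete, here is a hedged sketch using throwaway files in /tmp as stand-ins for real drives. `dd bs=1` forces a read()/write() syscall pair per byte, which is even worse than stdio's 512-byte buffering, but it illustrates the same cost; `bs=1M` makes one syscall pair per mebibyte:

```shell
# Create a 2 MiB sample file to copy (stand-in for a real drive).
dd if=/dev/zero of=/tmp/sample bs=1M count=2 2>/dev/null

# One read()+write() syscall pair per *byte* -- the putchar(getchar()) pattern,
# minus even stdio's buffering.  Watch how long this takes.
time dd if=/tmp/sample of=/tmp/copy.slow bs=1 2>/dev/null

# One syscall pair per MiB -- orders of magnitude fewer kernel crossings.
time dd if=/tmp/sample of=/tmp/copy.fast bs=1M 2>/dev/null

# Both copies are byte-identical; only the elapsed time differs.
cmp /tmp/sample /tmp/copy.slow && cmp /tmp/sample /tmp/copy.fast && echo identical
```

The data moved is the same either way; the difference is purely how many times you cross into the kernel.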
Then there's using 'dd' with big buffers between files. Oh, right, files, which means file system overhead, and quite possibly the allocation of new file extents and putting those references in the file map.
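One way to take the extent-allocation cost out of the copy itself is to preallocate the destination first. A sketch, again with /tmp files standing in for a real source drive and target filesystem (paths are examples):

```shell
# Fake source "drive" -- 4 MiB of zeros.
dd if=/dev/zero of=/tmp/source.img bs=1M count=4 2>/dev/null

# Ask the filesystem for all the destination's extents up front,
# instead of allocating them block by block during the copy.
fallocate -l 4M /tmp/dest.img

# conv=notrunc keeps the preallocated length instead of truncating it away.
dd if=/tmp/source.img of=/tmp/dest.img bs=1M conv=notrunc 2>/dev/null

cmp /tmp/source.img /tmp/dest.img && echo ok
```

On a real target drive the preallocation also tends to give you more contiguous extents, which helps sequential write speed.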
For my main use case I'm using ewfacquire as the user space app. It can compress the data stream, but I find the optimum wall-clock time for the whole copy is with no compression.
So, compression is slow. That may be the algorithm or your machine. Metricating that is another issue!
Ewfacquire works more like dd with tunable block sizes. I was testing with 32 KB blocks as I recall (I use the default transfer size).
You should metricate with that as well when you get the FS issue sorted out.
I will try even bigger blocks, but my testing with SATA drives and dd showed 4 KB blocks were only slightly slower than 1 MB blocks.
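That kind of comparison is easy to script. A hedged benchmark sketch: it uses a /tmp file instead of a real drive, so here it mostly measures syscall overhead; on real hardware you would point `src` at the device under test:

```shell
# Sample "source" -- 16 MiB file; substitute a real device path to test hardware.
src=/tmp/bs-test.src
dd if=/dev/zero of="$src" bs=1M count=16 2>/dev/null

# Time the same copy at several block sizes.
for bs in 4k 32k 1M; do
    echo "bs=$bs"
    time dd if="$src" of=/tmp/bs-test.dst bs="$bs" 2>/dev/null
done
```

If the times barely move between 4k and 1M, the drive (or the filesystem) is the bottleneck, not the block size.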
If your file system is 4K based and you are not writing to some preallocated file then I'd expect to see something like that.
Some file systems are faster than others.
I hate to admit that for the data transfer scenario I'm writing to ntfs formatted drives, so ntfs-3g is getting a major workout.
ROTFLMAO!
I will test with ext4, btrfs and xfs just to see if the filesystem is a major bottleneck.
All of those are tunable as to the underlying block size. They also have journals, and the journal block size can affect performance. Some people have got good results putting the journal for a slower rotating drive on a small SSD, back in the days when SSDs were small :-) Still, it says something about journalling. Oh, and there are also hashing algorithms to consider.

Sometimes my btrfs freezes. Well, actually the table algorithm does a rebuild and the load average goes up to around ten, though as high as 15-18 isn't uncommon and I saw 27 once. At least that's what I think is happening, from the unresponsive 'ps' and 'top' and 'iotop'.
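For ext4, the block size is picked at mkfs time and is easy to check afterwards. A sketch on a small image file (the path and sizes are arbitrary examples); the external-journal commands are shown as comments only, since they need real block devices:

```shell
# Build a small ext4 image with an explicit 4 KiB block size.
truncate -s 16M /tmp/fs.img
mkfs.ext4 -q -F -b 4096 /tmp/fs.img

# Confirm the block size the filesystem was created with.
dumpe2fs -h /tmp/fs.img 2>/dev/null | grep -i '^Block size'

# The journal-on-SSD trick would look like this (not run here;
# device paths are hypothetical):
#   mke2fs -O journal_dev /dev/ssd_partition
#   mkfs.ext4 -J device=/dev/ssd_partition /dev/slow_disk
```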
I had to ewfacquire about 50 drives a couple weeks ago, and the throughput really was a bottleneck in me getting through the project.
Trouble is I had to ship the data (forensic images) to my client and ntfs or fat32 are the only 2 realistic filesystems for the delivery.
Looks like you have a set of handcuffs there.

Sideline: I bought a 64G microSD for my tablet and a full-size SD carrier so I could plug it into my desktop to copy books and music. Only my desktop refused to see it. It turns out large cards use exFAT. Linux doesn't have an in-kernel driver for that, but I found a FUSE one. You might consider whether your clients can use something like that.

-- 
/"\   ASCII Ribbon Campaign
\ /   Against HTML Mail
 X
/ \

A lot of managers talk about 'thinking out of the box,' but they don't understand the communication process by which that happens. You do not think out of the box by commanding the box! You think out of the box precisely by bringing ideas together that don't allow dominant ideas to continue to dominate.
   -- Stan Deetz
-- 
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org