Greg Freemyer wrote:
On 11/9/06, Stan Glasoe <srglasoe@comcast.net> wrote:
On Thursday November 9 2006 9:52 am, Carlos E. R. wrote:
The Thursday 2006-11-09 at 08:30 -0600, Stan Glasoe wrote:
Not. RAID5 has to first write the data, then it has to calculate the parity, then it writes the parity. It can't do all three in parallel. It has to actually write the data portion so it has the necessary information to calculate the parity and then write the parity.
I don't think that is correct.
So says the HOWTO:
Both read and write performance usually increase, but can be hard to predict how much. Reads are similar to RAID-0 reads, writes can be either rather expensive (requiring read-in prior to write, in order to be able to calculate the correct parity information), or similar to RAID-1 writes. The write efficiency depends heavily on the amount of memory in the machine, and the usage pattern of the array. Heavily scattered writes are bound to be more expensive.
Carlos E. R.
The HOWTO gives incomplete information in this case. If there is a write at all on RAID5, then it isn't similar to RAID1. All the calculations may be done in system or controller-card memory, but as far as benchmark software timing is concerned, RAID5 still has to make 2 writes, versus 1 write for RAID1, before that I/O is considered done.
RAID1 can write to one disk and then, when time and workload permit, mirror that write to the other drive. Both drives typically aren't written to at the same time. So once the data is committed to either disk, the benchmark doesn't track the other I/O, although it may be affected, since that background I/O may delay a read or write that is next in the queue. That background write probably doesn't interfere with the next I/O: if that drive is busy, the other drive most likely isn't, and the next I/O the benchmark is counting goes to the available drive. Not always, but often enough.
RAID5 writes have to calculate and write the changed parity. That adds I/O delay and it is always more than RAID1. A RAID5 write has to write a data chunk, read the other data chunk, calculate parity and then write the parity chunk. Now the I/O is committed to disk and the benchmark can call it complete. 2 writes are involved; changed data and changed parity.
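To make the parity arithmetic in this exchange concrete, here is a minimal sketch in Python. It assumes a plain XOR-parity stripe and the common read-modify-write path for a small write (read old data, read old parity, write new data, write new parity). The names `xor_bytes`, `full_parity` and `small_write` are hypothetical, and this is not how Linux md or any particular controller is actually implemented.

```python
from functools import reduce

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equally sized chunks."""
    return bytes(x ^ y for x, y in zip(a, b))

def full_parity(chunks):
    """Parity of a stripe: XOR of all its data chunks."""
    return reduce(xor_bytes, chunks)

def small_write(stripe, parity, index, new_data):
    """Update one data chunk using read-modify-write.

    Returns the new stripe, the new parity, and the disk I/Os this
    simplified model counts for the operation.
    """
    old_data = stripe[index]              # read 1: old data chunk
    old_parity = parity                   # read 2: old parity chunk
    # XOR the old data out of the parity, then XOR the new data in.
    new_parity = xor_bytes(xor_bytes(old_parity, old_data), new_data)
    new_stripe = list(stripe)
    new_stripe[index] = new_data          # write 1: new data chunk
    io = {"reads": 2, "writes": 2}        # write 2: new parity chunk
    return new_stripe, new_parity, io

# Illustrative stripe with three data chunks (made-up contents).
stripe = [b"\x01\x02", b"\x10\x20", b"\x0f\xf0"]
parity = full_parity(stripe)
new_stripe, new_parity, io = small_write(stripe, parity, 1, b"\xaa\x55")

# The incrementally updated parity matches a full recomputation.
assert new_parity == full_parity(new_stripe)
```

In this model the two writes (data and parity) are the ones counted above, and the two reads come on top of them unless the old chunks are already in cache.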
Hence my opinion that RAID5 writes always suck in the performance area. They do, they can't help it. Depending on the system, is that performance hit of any concern versus the survivability of the data?
Stan
The calculation is not normally done that way. To compare different RAID efficiencies the standard process is to calculate or compare the number of logical writes/second the array will sustain under heavy load and a well utilized cache to minimize wasted work.
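As a rough illustration of comparing arrays by logical writes per second, the following Python sketch uses the usual back-of-the-envelope model: total physical I/Os per second divided by the physical I/Os each logical write costs. The per-disk figure of 100 I/Os per second and the four-disk array are made-up numbers. The costs are assumptions of the model: 2 I/Os for a RAID1 write, 4 for an uncached RAID5 small write, and N/(N-1) for a RAID5 full-stripe write that a well-used cache has assembled.

```python
def logical_writes_per_sec(disks: int, iops_per_disk: float,
                           ios_per_write: float) -> float:
    """Sustained logical writes/second under a simple I/O-count model."""
    return disks * iops_per_disk / ios_per_write

def raid5_full_stripe_cost(disks: int) -> float:
    """Physical I/Os per logical chunk when a whole stripe is written:
    N chunk writes (N-1 data plus 1 parity) carry N-1 logical chunks."""
    return disks / (disks - 1)

DISKS = 4            # hypothetical array size
IOPS_PER_DISK = 100  # hypothetical per-disk random I/O rate

raid1 = logical_writes_per_sec(DISKS, IOPS_PER_DISK, 2)
raid5_small = logical_writes_per_sec(DISKS, IOPS_PER_DISK, 4)
raid5_full = logical_writes_per_sec(
    DISKS, IOPS_PER_DISK, raid5_full_stripe_cost(DISKS))

print(f"RAID1 (mirrored pairs): {raid1:.0f} logical writes/s")
print(f"RAID5 small writes:     {raid5_small:.0f} logical writes/s")
print(f"RAID5 full stripes:     {raid5_full:.0f} logical writes/s")
```

Under these assumptions the ranking depends entirely on how often the cache can turn small writes into full-stripe writes, which is why the usage pattern matters so much.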
IMHO, this should not be stated in absolute terms. E.g., on typical database servers, one almost always uses unbuffered (synchronous) writes. Only the DBMS caches; the filesystem cache is turned off, because there reliability (the D in ACID) is much more important than any performance improvement. Cache is then not a factor in disk benchmarking; the beef is in fast synchronous writes when one selects a disk subsystem. The DBMS cache efficiency will be the same for all RAID variants; the log tablespace performance limits the overall write efficiency.
This is one of the reasons why one does not use plain IDE or SATA disks without battery buffers for such systems, but real SCSI disks where one can control such capabilities in detail. E.g., one typically selects fs sync flags differently for the temp and the log tablespaces. (Btw, I prefer EMC storage subsystems for their reliability and their feature set. This is not a plea for Linux md.)
In real storage selection scenarios, performance benchmarking depends highly on the application that the server is used for. I do this for a living, and in the situations where I'm involved (typically mission-critical servers) RAID-5 is becoming less and less interesting. Current storage prices make RAID-1 practical for TB-sized storage systems; only at the PB level do we need different scenarios (HSMs, actually).
Joachim
--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Joachim Schrod Email: jschrod@acm.org
Roedermark, Germany