On Wed, Jul 9, 2014 at 2:42 PM, Carlos E. R. <robin.listas@telefonica.net> wrote:
> On 2014-07-09 20:00, Greg Freemyer wrote:
>> fyi: Raid 5 and 6 have the same issue, but the most efficient write size is the size of a full raid stripe. XFS, as an example, will try to structure its writes to be full stripes when working with raid. Thus a full stripe write becomes: write all data and parity chunks.
>> A partial stripe write becomes:
>>
>>  - read the data about to be overwritten,
>>  - read the old parity info,
>>  - calculate the new parity info,
>>  - write the new data,
>>  - write the new parity info.
>> Thus a single partial stripe write to a raid 5 requires at least 4 i/o operations. For a raid 6, it is a minimum of 6 i/o's (3 reads, 3 writes). Having the filesystem invoke properly aligned full stripe writes is significantly more efficient.
> Doh! So that's why!
>
> And you mean that number of i/o per disk on the raid? No, that cannot be. It has to be one read per disk, that is, 3 reads (which should happen simultaneously). Calculate parity. Do 3 writes, one per disk (which should also happen simultaneously).
>
> Mmm... it does not match what you say, so I must be getting it wrong :-?
To understand this we have to dig into some mathematical theory, so forget about reads and writes for a second and just focus on the math.

== Beware: math below ==

Let's talk about a raid 5 with 5 disks, with one of the stripes laid out as:

D1, D2, D3, D4, P

By definition, P = D1 ^ D2 ^ D3 ^ D4 (that's just how raid 5 works). ^ is the xor operator as defined in the C programming language, but applied to an entire stride's worth of bytes.

If I want to change the data on D2, then I can back it out of the calculation by:

P ^ D2 = (D1 ^ D2 ^ D3 ^ D4) ^ D2

Because of the way ^ works, that can be simplified to:

P ^ D2 = D1 ^ D3 ^ D4

(i.e. the D2 xor operations effectively cancel themselves out).

Now, if I call the new D2 data D2n, I can write a new equation as:

(P ^ D2) ^ D2n = (D1 ^ D3 ^ D4) ^ D2n

or by simple math:

P ^ D2 ^ D2n = D1 ^ D2n ^ D3 ^ D4

====

Alright, time to talk about disks.

We know that before updating D2, this is true:

P = D1 ^ D2 ^ D3 ^ D4

And after updating D2, this must be true:

Pn = D1 ^ D2n ^ D3 ^ D4

The obvious approach is to read D1, D3 and D4 and calculate Pn. That means 3 reads and 2 writes, or 5 i/o operations.

But let's do some math on that last equation:

Pn = D1 ^ D2n ^ D3 ^ D4
Pn = (D1 ^ D3 ^ D4) ^ D2n

But remember, from the earlier math we know:

P ^ D2 = D1 ^ D3 ^ D4

So let's replace (D1 ^ D3 ^ D4) with P ^ D2. We now have:

Pn = (P ^ D2) ^ D2n

Note that this only requires 2 of the old values.

So what the raid system does is read the original P stride and the original D2 stride. Then it xor's them together to remove the influence of the old D2 value. Then it xor's in the new D2n stride to calculate the new Pn.

The end result is 2 simultaneous reads (P and D2) followed by 2 simultaneous writes (Pn and D2n).

The cool part is that it works regardless of how many disks are in the raid 5 array. A single data stride update always requires exactly 2 reads and 2 writes.

=====

I don't actually know how raid 6 works, so I can't do the same walk-thru, but my understanding is that a single data stride update with raid 6 involves 3 reads (P1, P2, D2) and 3 writes (P1n, P2n, D2n). The rest of the data strides don't have to be read to do the calculations.

=====

That was a fun exercise. I hope at least a couple of people learned something.

Greg
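
P.S. For anyone who wants to see the xor algebra hold for themselves, here is a small C snippet (my own toy sketch, not anything out of the kernel's raid code; the byte values are made up):

  #include <assert.h>
  #include <stdio.h>

  int main(void)
  {
      /* one byte from each data stride, arbitrary values */
      unsigned char d1 = 0xA5, d2 = 0x3C, d3 = 0x77, d4 = 0x0F;
      unsigned char p = d1 ^ d2 ^ d3 ^ d4;   /* raid 5 parity definition */

      /* xor'ing the old d2 back out cancels it from the parity */
      assert((p ^ d2) == (d1 ^ d3 ^ d4));

      /* the update rule: Pn = (P ^ D2) ^ D2n */
      unsigned char d2n = 0xC3;              /* the new d2 data */
      unsigned char pn = p ^ d2 ^ d2n;

      /* same parity you would get recomputing from scratch */
      assert(pn == (d1 ^ d2n ^ d3 ^ d4));

      puts("xor identities hold");
      return 0;
  }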
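
And the same idea over whole strides, the way the raid layer's read-modify-write would do it: compute Pn from just the old P and the old D2, then cross-check against a full recomputation. The 8-byte stride size and the xor_stride() helper are toy stand-ins, of course; real chunks are more like 64 KiB:

  #include <assert.h>
  #include <stdio.h>
  #include <string.h>

  #define STRIDE 8                      /* toy stride size */

  /* dst ^= src, over one whole stride */
  static void xor_stride(unsigned char *dst, const unsigned char *src)
  {
      for (int i = 0; i < STRIDE; i++)
          dst[i] ^= src[i];
  }

  int main(void)
  {
      unsigned char d1[STRIDE] = "1111111", d2[STRIDE] = "2222222",
                    d3[STRIDE] = "3333333", d4[STRIDE] = "4444444";
      unsigned char d2n[STRIDE] = "fresh!!";   /* new data for D2 */

      /* original parity: P = D1 ^ D2 ^ D3 ^ D4 */
      unsigned char p[STRIDE] = {0};
      xor_stride(p, d1); xor_stride(p, d2);
      xor_stride(p, d3); xor_stride(p, d4);

      /* the shortcut: Pn = (P ^ D2) ^ D2n -- 2 reads, 2 writes,
         no matter how many disks are in the array */
      unsigned char pn[STRIDE];
      memcpy(pn, p, STRIDE);
      xor_stride(pn, d2);
      xor_stride(pn, d2n);

      /* full recomputation with the new D2, for comparison */
      unsigned char full[STRIDE] = {0};
      xor_stride(full, d1); xor_stride(full, d2n);
      xor_stride(full, d3); xor_stride(full, d4);

      assert(memcmp(pn, full, STRIDE) == 0);
      puts("shortcut parity matches full recompute");
      return 0;
  }

Both compile cleanly with gcc and the asserts stay quiet, i.e. the shortcut parity is byte-for-byte identical to the parity you'd get by reading every data stride.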