Mailinglist Archive: opensuse (729 mails)

[opensuse] Re: raid use case
greg.freemyer@xxxxxxxxx wrote:

Linda, I simply think it is obvious raid6 is inherently safer than raid10.

== gory details ==

For the situation without background data scrubbing:

let's say there is an "x" chance that a drive has undetected bad sectors.
===
At this point, why are we discussing that? I said my
raid card does background scanning automatically, once a week (you can
change it to a longer interval, have it scan continuously, or have it
limit itself to 'X'% of the disk bandwidth).
Someone mentioned that mdraid on linux has an option
for doing the same -- so that doesn't appear to be at issue.

What is at issue is whether or not it is safer to use RAID6 instead of RAID10+daily backups.
Note: (I scanned about 20 disks out of laptops about 3 weeks ago. One of them
had a bad sector. 1 in 20 drives having an undetected bad sector sounds about
right to me based on personal experience. My job calls for doing a sector-by-
sector read of my clients' drives, so I'm often the person who finds undetected
media errors.)
---
Nearly all the times I've run into bad sectors have been when I've
used desktop drives. Many desktop drives are enterprise drives w/remapped
sectors -- i.e. the vendor's tests indicated a need for sector remapping
beyond some quality threshold.

My own experience w/failures: more than once they were
desktop drives. About 15 years ago, I didn't realize how
much difference there was between desktop and enterprise-level drives
and made the mistake of ordering a batch for use w/a HW raid card.
None of them would work -- because all of them had remapped sectors.

A second data point was on an order of 26 drives. They
were enterprise drives, but the vendor didn't list or say that they were
also 'pre-owned/remanufactured' drives. The LSI HW raid card labeled
23 of them 'Bad' -- even though they were all enterprise drives. A bit
of research turned up that their original manufacture date was over 3
years old. Even though the drives come w/a 5-year warranty,
the OEM had (has) a registry of remanufactured drives and won't honor
the warranty on drives that have been 'remanufactured'.

Of special note: I scanned the drives that were rejected.
All came up "error-free", but 23 of them had too many remapped
sectors, which the HW raid card detects as soon as the drive spins
up (i.e. it has to be using the #defects list -- there is insufficient
time for it to have scanned the drive).

In my testing I've found that drives with remappings take
longer to scan than drives w/no remappings -- the worst drives
took 15% longer to complete a scan than those without.

Background scanning that *rewrites* the sector to the same
physical media is _potentially_ hiding an issue that Google's
statistical data shows carries a significantly higher failure chance
in the next month.

So if one member of a mirror dies, then there is x chance that the remaining
member has at least one bad sector. At least as of a few years ago, mdraid
would abort a mirror rebuild as soon as it hit that bad sector.

I think most of us would agree that in the course of 10 years the odds
of at least one drive failure in a mirror pair are effectively 100%, so the odds
of a full raid failure are at least x, where I claim x is about 1 in 20.
----
It is recommended that consumer drives be replaced every 3 years, and
enterprise drives in the 4-5 year time frame. If you are still using a
10-year-old drive, you might as well play Russian roulette. That's scary!
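Greg's back-of-envelope above can be written out explicitly. A minimal sketch, using only the figures quoted in the thread (the 1-in-20 bad-sector rate and the near-certain single-drive failure over 10 years are his numbers, not measurements of mine):

```python
# Probability model for a 2-way mirror over ~10 years, using the
# figures quoted in the thread (not measurements of mine).
p_drive_fails = 1.0      # odds at least one mirror member fails in 10 years
p_bad_sector = 1 / 20    # odds the surviving member has an undetected bad sector

# Data is lost when a drive dies AND the survivor's bad sector
# aborts (or corrupts) the rebuild.
p_data_loss = p_drive_fails * p_bad_sector
print(f"P(data loss over 10 years) ~= {p_data_loss:.2f}")  # ~0.05, i.e. 1 in 20
```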

(yes there are ways to force data loss for that bad sector and trigger a remap.
After that the rebuild should complete, but you still have at least one sector
of known data loss.)

With a 4-drive raid6, you need the exact same sector on 2 of the 3 remaining
drives to be bad.
---
So with a RAID10 (or RAID1), wouldn't you also need the exact same sector to be bad -- as well as in the backup image, which is also on a RAID10?


If you issue a format command or dd if=/dev/zero of=RAID6, or if
you upgrade to a new OS, your machine may be unbootable. RAID6 won't
do you any good there. My suggestion was in response to the OP -- who was
going to have 1 new drive and asked how he could best use those two
drives: with 1 extra drive, use it for incremental backups.
Even if all 3 of the surviving member drives have a single bad sector, the odds
of it being the exact same sector are in the billion to one odds range.
---
Same would hold for 2 separate RAID10's where the 2nd is used
for incremental backups.
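For what it's worth, the "billion to one" figure quoted above is plausible: if each surviving drive has one bad sector at an effectively random LBA, the chance of a collision is roughly one over the sector count. A rough sketch (the 1 TB drive size and 512-byte sectors are my assumptions, not from the thread):

```python
# Odds that two drives each have their single bad sector at the same LBA.
drive_bytes = 1_000_000_000_000        # assumed 1 TB drive (my assumption)
sector_bytes = 512
sectors = drive_bytes // sector_bytes  # ~2 billion sectors

# With one bad sector per drive at an effectively random LBA, a collision
# needs the second drive's bad sector to land on the first drive's bad LBA.
p_same_sector = 1 / sectors
print(f"1 in {sectors:,}")  # on the order of one in two billion
```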
Then let's talk about performance.

In the data scrubbing case:

For this I will assume the drives are "perfect" and have no hidden bad sectors.

Assume the odds of the second drive of a mirror pair failing before a rebuild
can complete are y.

Thus the odds of a mirror pair totally failing are simply y (maybe one in
100,000).

For the raid 6 you need 2 additional drives to fail prior to the rebuild
completing. Thus the odds are on the order of y^2. (Maybe one in
10,000,000,000)

Thus the odds of a raid6 failure are on the order of 1 in a billion, but the
odds of a mirror failing are on the order of one in 100,000.
----
But the mirror is incrementally backed up on a 2nd RAID10.
All of a sudden you are talking about having at least 4 copies of the
data -- all would have to fail. In my use case, I'd have to have 3 additional
drives fail before a rebuild is complete. How could that not be safer --
protected against soft attacks that corrupt the data AND HW failures?
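Laying the two probability models side by side: y is the per-rebuild-window failure chance from the quoted text (the 1-in-100,000 value and the independence assumption are Greg's), and the extra factor for the backup array follows the "3 additional drives" scenario described above:

```python
# Per-rebuild-window drive-failure chance quoted in the thread.
y = 1 / 100_000

p_mirror = y                   # RAID10: lose the other half of a degraded pair
p_raid6 = y ** 2               # RAID6: two further drives must fail in the window
p_mirror_plus_backup = y ** 3  # degraded RAID10 plus a second, incrementally
                               # backed-up RAID10: three more failures needed

for name, p in [("RAID10 mirror", p_mirror),
                ("RAID6", p_raid6),
                ("RAID10 + backup RAID10", p_mirror_plus_backup)]:
    print(f"{name}: ~1 in {round(1 / p):,}")
```

Under these (admittedly idealized, independent-failure) assumptions, the mirror-plus-backup setup comes out ahead of RAID6 by the same kind of squaring argument Greg uses.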

I've used RAID6 and the performance just wasn't there. But
aside from that benefit, I can restore my system from any day in the past 2
weeks, and from more spaced-apart images going back 3 months right now
(Feb 1 level-0 backups).


Another benefit of RAID10 (say, again for simplicity, 4 data spindles):
RAID1 only needs to write to a 2nd disk. RAID6 needs to *read* the 5 other
disks in the RAID6, and then do a write to at least 1 parity disk (maybe
both). That has to hurt performance....
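The write-amplification point can be made concrete by counting I/Os per small (sub-stripe) write. The counts below use the textbook read-modify-write scheme for RAID6 (read old data plus both old parity blocks, write new data plus both new parity blocks) -- the full-stripe reconstruct-write described above touches even more disks, so this is the charitable case:

```python
# I/Os needed to service one small (sub-stripe) random write.
raid10 = {"reads": 0, "writes": 2}  # write the block to both mirror halves
raid6 = {"reads": 3, "writes": 3}   # read old data, P, Q; write new data, P, Q

for name, ios in [("RAID10", raid10), ("RAID6", raid6)]:
    total = ios["reads"] + ios["writes"]
    print(f"{name}: {ios['reads']} reads + {ios['writes']} writes = {total} I/Os")
```

Even in this best case, RAID6 does 6 I/Os where RAID10 does 2, which is why small-write workloads suffer on parity raid.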

Also, FWIW, if I got a new disk, the new (usually larger) disk went
to backups.





