Re: [opensuse] Failed RAID Please Help

24 Jul 2008

      ----- Original Message ----- 
From: "John Andersen" 
Cc: 
Sent: Wednesday, July 23, 2008 8:50 PM
Subject: Re: [opensuse] Failed RAID Please Help
...
On Wed, Jul 23, 2008 at 3:47 PM, Rodney Baker wrote:
...
On Thu, 24 Jul 2008 03:30:54 John Andersen wrote:
...
[...snip...]
If it's raid0 you have bigger problems, about the same problems as if
you had used LVM and skipped raid altogether, but even given
the lack of redundancy, LVM makes more sense than raid0
in linux.  So I'm guessing no sane person would use raid0
just to concatenate drives in linux, and you probably don't have raid0.
Hmmm; last time I saw him my doctor said he thought I was still sane, yet
I'm using raid0 for exactly that purpose...
My previous experience with LVM was that it was a PITA to set up and then
it got corrupted due to a power outage. As a result /home was completely
hosed :-(.
I learned from that - I won't use LVM again. /home is now on a raid1
array, with nightly backups to an external drive, and non-critical data
(e.g. stuff downloaded from the net) goes onto a raid0 array that I used
to concat three smaller partitions that
Don't assume from the fact that you have not YET had a failure on raid0
that it is any safer than LVM.  It's about the same risk.  Loss of any one
of the partitions may cause loss of ALL data.
Depending on what file system you format the raid0 with, it could be really
serious to just have a couple of sectors go bad.
Raid0 composed of 3 drives TRIPLES your chance of loss, because a fault
on any ONE drive may render the whole thing borken.  If you had a
1 in 10000 chance of a drive failure previously, you now have a 3 in 10000
chance.
Indeed.
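
For what it's worth, the "3 in 10000" figure is the usual small-probability approximation. Here is a minimal sketch, assuming each drive fails independently with the same probability over the period in question (both assumptions are mine, and the function name is only for illustration):

```python
def raid0_loss_probability(p_drive: float, n_drives: int) -> float:
    """Chance that at least one of n drives fails, which kills a raid0.

    Assumes the drives fail independently with equal probability p_drive.
    """
    return 1.0 - (1.0 - p_drive) ** n_drives


p = 1 / 10000                          # per-drive chance from the example above
exact = raid0_loss_probability(p, 3)   # 1 - (1 - p)^3
approx = 3 * p                         # the "3 in 10000" rule of thumb

print(f"exact:  {exact:.8f}")
print(f"approx: {approx:.8f}")
```

The exact value comes out very slightly under 3 in 10000; for failure probabilities this small the difference is not worth worrying about.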
Further, I think a more accurate and scarier way to represent it is:
If the MTBF of one drive is 600,000 hours,
Then MTBF of a 3 drive raid0 is only 200,000 hours.
(600k is a typical estimate for commodity sata drives)

Worse, commodity sata drives only have a duty cycle of about 30%.
So, if you are running these 24/7 instead of 8/7 then the individual mtbf
drops to merely 200,000 hours and the mtbf for the array drops to merely
66,666 hours.
So the lifetime of the array is only a little better than 10% of the
nominal/advertised lifetime of a drive.
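
The arithmetic behind those numbers can be sketched as below. This assumes independent drives with constant failure rates (so the rates add and the array mtbf is the drive mtbf divided by the drive count) and a crude linear derating for running 24 hours a day instead of the rated 8. The helper names are made up for the example.

```python
HOURS_PER_YEAR = 24 * 365


def array_mtbf(drive_mtbf_hours: float, n_drives: int) -> float:
    """MTBF of a raid0 where any single drive failure kills the array.

    Assumes independent drives with constant failure rates, so the
    rates add up and the MTBF divides by the number of drives.
    """
    return drive_mtbf_hours / n_drives


def derated_mtbf(rated_mtbf_hours: float, rated_hours_per_day: float,
                 actual_hours_per_day: float = 24) -> float:
    """Scale the rated MTBF by rated/actual daily running hours.

    This linear derating is the rough rule used in this post, not a
    vendor formula.
    """
    return rated_mtbf_hours * rated_hours_per_day / actual_hours_per_day


rated = 600_000                                           # hours, one drive
drive_24x7 = derated_mtbf(rated, rated_hours_per_day=8)   # one drive run 24/7
array_24x7 = array_mtbf(drive_24x7, 3)                    # 3-drive raid0

print(f"drive 24/7: {drive_24x7:,.0f} hours")
print(f"array 24/7: {array_24x7:,.0f} hours "
      f"({array_24x7 / HOURS_PER_YEAR:.1f} years, "
      f"{array_24x7 / rated:.0%} of the rated figure)")
```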
And, on top of all that, remember the M in MTBF, MEAN time between failures.
That 66,666 hour estimate is the average, so roughly half of all such arrays
(or more) will die even sooner, some much sooner.
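
To put a rough number on how many die before the mean: if you assume a constant failure rate (the exponential model that mtbf figures are usually quoted under, and an assumption on my part here), the fraction that has failed by the time you reach the mtbf is actually closer to two thirds than to half. A small sketch, with function names invented for the example:

```python
import math


def fraction_failed_by(t_hours: float, mtbf_hours: float) -> float:
    """Fraction of units failed by time t under an exponential model."""
    return 1.0 - math.exp(-t_hours / mtbf_hours)


def median_lifetime(mtbf_hours: float) -> float:
    """Time by which half the units have failed (exponential model)."""
    return mtbf_hours * math.log(2)


mtbf = 66_666  # hours, the 3-drive raid0 estimate from above

failed_at_mtbf = fraction_failed_by(mtbf, mtbf)
half_dead_at = median_lifetime(mtbf)

print(f"failed by the mtbf: {failed_at_mtbf:.1%}")
print(f"half failed by:     {half_dead_at:,.0f} hours")
```

Real drives don't follow a clean exponential curve (infant mortality and wear-out both bend it), so treat this as an illustration of mean versus median, not a prediction.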

7.6 years sounds like a long time but that's total drive failure. Data 
corruption happens long before that.
I don't know where they get those huge mtbf estimates anyway. I see drives
fail all the time in as little as a year. Some last 10 years, true, but many
last 1, 2, or 3. If your power conditioning, air temperature and cleanliness
aren't all *perfect* that surely drops all the numbers way down too. Running
hot and suffering power fluctuations and surges, both on the power connector
and on the data connector, definitely kills drives early, and what most
people have in their homes is pretty bad power, pretty dusty air, and neither
cold enough air nor enough air flow. Those ridiculously long mtbf estimates
are probably simply what's required just to make a drive last a year or so in
normal conditions.

Don't bet that your ups does any power conditioning either. The cheap ones
mostly don't. They are simply switches and as long as there is power
available from the wall, you are directly connected to the wall. Maybe there
is a little surge absorption in play like what a cheap power-strip has,
which is just about worthless for the purposes of this topic. Its value is
that maybe you don't lose your whole room full of hardware when lightning
hits your circuit. It does just about nothing for the 24/7 general dirtiness
of most wall power, which gradually kills hardware a lot sooner than if the
power was perfect 24/7 over the same period of time.

I'm seeing one out of ten drives die within 3 years even in a perfectly
controlled and protected environment: consistent low temperature, good
strong airflow over the drives, 100% power conditioning ups's, closed room
(no constant influx of new dust) so the parts all stay clean. And that's
with 100% duty cycle, 5 year warranty u320 scsi drives, not just commodity
ide and sata drives. By "die" I also mean merely that the raid card they are
connected to has marked them bad, meaning it detected a single data
discrepancy. That's a far cry from total drive failure, and it happens a lot
more easily and a lot sooner on average.

Conversely, I have seen linux's software raid mark drives bad when really
there was nothing wrong with them. Depending on the controller I've seen
dmraid mark up to 50% of drives bad when they were really all 100% ok. Those
same exact drives, on the same exact motherboards & cases, in the same exact
server farm/power/air temp/etc..., running the same exact OS & software, but
plugged into a real raid card instead of using software raid, were fine and
still are to this day. They are actually under heavier load now: the servers
in question never made it out of testing/vetting while the drives were
"dying" so often, but are in full production now. That was just using raid10
in software too, not even the extra complication of raid5.

Raid0 has its uses, but it definitely should be used with very open eyes
and the acceptance that the array will likely die and all data will be gone
in as little as a year or maybe three. Just do whatever you have to do to
somehow arrange to be ok with that.

-- 
Brian K. White    brian@aljex.com    http://www.myspace.com/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro  BBx    Linux  SCO  FreeBSD    #callahans  Satriani  Filk!
