On Thu, Jan 8, 2009 at 3:11 PM, David C. Rankin wrote:
> Greg Freemyer wrote:
>> I joined the mdraid mailing list a couple of months ago. I mostly
>> lurk. Most of the problems I see are when people try to grow /
>> reshape a raid array, or when they have serious problems and try to
>> recover. It is surprising how many people post about experiencing a
>> dual disk failure and then need to try to rebuild without restoring
>> from backups. Apparently mdraid is a little aggressive about kicking
>> drives out of the array, because often they are able to get the
>> array operational again with exactly the same hardware.
>>
>> I'm still a hardware raid guy, and having monitored the mdraid list
>> for a couple of months, I'm likely to stay one until mdraid becomes
>> more tolerant of transient issues. To be fair, hardware raid may be
>> just as problematic in situations where power supplies, cables, or
>> external enclosures are behaving erratically; I just have not
>> watched their support channels, so I can't say.
>
> I don't beat up on my drives or anything like that, but my experience
> with software raid has truly been that I set them up, check them
> once, and then forget about them. I have had some spinning for 2
> years now without a hiccup (knock on wood).
>
> I am fairly cautious, though. When I set one up, I use matched pairs
> of drives (nothing special, just 2 of the same type ordered at the
> same time), check that all is good with mdadm or dmraid, or just
> watch the BIOS output if it is a BIOS raid, and that is all the
> thought my arrays ever get.
>
> The only death knell I see, which would hold true for hardware or
> software raid, is memory corruption that spews gibberish all over
> your drives. Raid or not, if that happens, you had better have a
> backup handy...
In the last year I have seen / heard of a lot of bad "hard drive
batches". In particular, we bought a batch of Seagates and several of
them failed in the first 50 hours of use. In one case I had 2 copies
of the same important data. Both drives holding it failed within 8
clock hours of each other and I was left without a copy. Fortunately
I swapped the drive electronics of one of the failed drives with
those from a working drive and was able to copy the data off.
Thus, per the mdraid list, the current recommendation is actually to
use drives from different batches, and even from different vendors.
The idea is that you reduce the likelihood of a double failure and,
in turn, of data loss.
A lot of people are also moving to RAID 6 because the reliability of
drives seems to have gone down so significantly in the last 12
months. Mostly it is tied to specific models, so you can get the
impression that all is well if you are not surveying a large number
of models. From what I've read, I think all of the vendors are
putting out the occasional bad model right now as they push the
limits on capacity / speed. And since a given model is not in
production all that long, it is hard to know the quality of what you
are about to buy.
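For anyone weighing that move, the trade-off behind RAID 6's second
parity block is easy to work out. The figures below are illustrative
assumptions, not numbers from this thread:

```shell
# RAID 6 keeps two parity blocks per stripe, so with n drives of
# size s the usable capacity is (n - 2) * s, and any two drives can
# fail without data loss (RAID 5 gives (n - 1) * s but survives
# only one failure).
n=8        # drives in the array (assumed)
s=1000     # GB per drive (assumed)
echo "raid5 usable: $(( (n - 1) * s )) GB, survives 1 failure"
echo "raid6 usable: $(( (n - 2) * s )) GB, survives 2 failures"
```

So on an 8-drive array you give up one extra drive's worth of space
in exchange for surviving exactly the kind of double failure
described above.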
Also, SATA is far more demanding of power supplies than IDE drives
were. So a lot of people are building servers with 4 or 8 drives in
them and having mdraid kick out multiple drives due to a single
misbehaving component (the PSU). This was a bigger issue a couple of
years ago than it is now. I don't know if users have gotten smarter,
if the PSU manufacturers are doing a better job, or if the SATA drive
makers have made the drives more tolerant of power fluctuations.
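When mdraid does kick a member out, it shows up in /proc/mdstat as an
underscore in the member-status bitmap ([UU] healthy, [U_] degraded).
A minimal sketch of checking for that; the snippet parses a made-up
sample string so it runs anywhere, but on a real system you would read
/proc/mdstat itself (or use mdadm --detail):

```shell
# Sample /proc/mdstat output for a two-disk mirror that has lost a
# member: [2/1] means 2 configured, 1 active; [U_] marks the gap.
mdstat='md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks [2/1] [U_]'
# Any underscore in the status line means a member has been dropped.
if echo "$mdstat" | grep -q '_'; then
    echo "degraded"
else
    echo "healthy"
fi
```

Running something like this from cron (against the real /proc/mdstat)
is one way to notice a kicked drive before a second failure turns it
into data loss.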
Litigation Triage Solutions Specialist
First 99 Days Litigation White Paper -
The Norcross Group
The Intersection of Evidence & Technology
To unsubscribe, e-mail: opensuse+unsubscribe(a)opensuse.org
For additional commands, e-mail: opensuse+help(a)opensuse.org