----- Original Message -----
From: "Andrew Joakimsen"
On Mon, Sep 22, 2008 at 5:52 AM, Rui Santos wrote:
3) Interoperability. With most Intel chipsets you can hot-plug your HDs. Just use the mdadm tool to remove the HD from the array, remove the HD, add a new one, and rebuild your array. You can do this while your machine is in production. I believe you are unable to do that with a mobo pseudo-RAID?
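The hot-swap sequence Rui describes can be sketched with mdadm roughly like this. The array and device names below are hypothetical, and the exact steps assume your controller supports hot-plug:

```shell
# Mark the failing member faulty and pull it out of the array
mdadm --manage /dev/md0 --fail /dev/sdb1
mdadm --manage /dev/md0 --remove /dev/sdb1

# Physically swap the disk, partition it to match the others, then add
# the new one; the kernel starts rebuilding onto it automatically
sfdisk -d /dev/sda | sfdisk /dev/sdb   # copy the MBR partition table
mdadm --manage /dev/md0 --add /dev/sdb1

# Watch the rebuild progress
cat /proc/mdstat
```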
You forgot to mention that mdadm is CRAP. It will de-sync the array for no reason and there is no way to force a rebuild. You will end up having to rebuild the array. RAID 0 & 5 will suffer full data loss.
Only partially true. It's perfectly possible to force a rebuild. In fact, you can force rebuilds in mdadm in situations where no firmware RAID will ever let you. If you don't know how, that's a you problem, not an mdadm problem.

As for de-syncing for no reason: yes, that is a weakness of Linux's software RAID. Aside from it happening to me in an empirically provable way, when the question was posed to engineers from Adaptec and LSI, without pre-loading their thoughts by describing any symptoms, they predicted exactly the symptoms I was getting. I got this second-hand from a hardware vendor and system builder (Seneca Data), so it's fuzzy talk because the engineers were trying to talk layman and the guy at Seneca was only barely able to follow, but the gist was this: at the low level, Linux handles the disks differently and is quicker to assume a disk is bad or that a particular operation has failed, whereas hardware RAID cards take more active control of the disks and are more forgiving of transient disk (mis)behavior, such as an op not completing within a certain time window, or an op failing once but succeeding if simply repeated immediately.

This theory turns out to agree exactly with behavior I saw on a set of 10 identical servers. Each started out with 8 SATA drives: 4 on the motherboard SATA controller (nvidia) and 4 on a PCI-e LSI card. All plain SATA, no hardware RAID or fake-RAID. All boxes were loaded with exactly the same software and configuration via autoyast / autoinst.xml: openSUSE 10.3 i386 with software raid0:swap, raid1:/boot and raid5:/. All boxes had randomly failed drives. Some boxes couldn't even finish the install process before at least one drive went bad; others would run a few days and then have one or more failures under no load. Only 2 of them never had a problem.
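For the record, forcing a rebuild is straightforward. A minimal sketch, with hypothetical array and device names (it assumes the members are still physically readable):

```shell
# Stop the degraded/de-synced array, then force-assemble it from its
# members even if their event counters disagree
mdadm --stop /dev/md0
mdadm --assemble --force /dev/md0 /dev/sdb1 /dev/sdc1 /dev/sdd1

# A member kicked out as "faulty" can often simply be re-added; with a
# write-intent bitmap this resyncs only the blocks changed since the kick
mdadm --manage /dev/md0 --re-add /dev/sdb1

# Or trigger a full consistency check / repair pass on a running array
echo check > /sys/block/md0/md/sync_action
cat /sys/block/md0/md/mismatch_cnt
```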
For the first few drives I of course tried actually swapping in new drives and rebuilding; other times I just forced the existing drives to rebuild (yes, contrary to your claims, it's perfectly possible and works fine). Then I tried moving drives around to see if there were perhaps flaky hot-swap bays or connectors. Then I tried RAID 10 instead of 5.

After several weeks of this, and after hearing the Adaptec engineers' theory, we decided to take a chance and buy 10 Adaptec 3805 PCI-e RAID cards. We reinstalled the exact same OS and configuration, aside from using aacraid instead of libata and mdraid, onto the exact same drives, same backplanes and hot-swap bays, and the rest of the server. Same power and cooling environment, even... and we have not had one single problem on any drive on any server since, and they have all been in heavy production for several months now. So, clearly the drives weren't really bad, yet Linux software RAID marked them failed left & right.

However, it's also true that only certain hardware combinations may tickle this software weakness. I have several other machines in heavy production using purely software RAID, sometimes RAID 5, sometimes 10, that have been cranking away for a couple of years now without a blip. They are using different low-level hardware and drivers, and sometimes different (2-year-old) versions of Linux.

So, mdraid isn't necessarily "crap"; it just has compatibility quirks like everything else on the planet. And as for recommending to use or avoid it: as I said before, yes, you are making the right call that you should probably not use it. Other people, however, should make their own call based on something other than the fact that you don't know how to use mdadm. Just like I should probably not attempt to fly a helicopter: they are complicated and take a long time to learn, and I don't know how to fly them, yet I'm pretty sure they aren't just "crap" as a whole class.
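If the engineers' theory is right, the mismatch is between how long a desktop drive will internally retry a bad sector and how long the kernel waits before giving up on the whole disk. One commonly discussed mitigation, sketched below with hypothetical device names, is to bound one side and stretch the other (it only works on drives that support SCT ERC):

```shell
# Cap the drive's internal error recovery at 7 seconds (units are 0.1 s)
# so it reports a read failure instead of stalling through deep retries
smartctl -l scterc,70,70 /dev/sdb

# ...and/or raise the kernel's per-command timeout (seconds) well above
# the drive's worst-case retry time, so one slow op doesn't get the
# whole disk kicked out of the array
echo 180 > /sys/block/sdb/device/timeout
```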
I can, and have, fixed problems in mdadm that no hardware RAID will even let you think about. You can set up RAID arrangements that no hardware RAID can possibly do. You can perform operations on live running systems that no hardware RAID array can possibly do. A software RAID array can be copied and run on any hardware Linux supports. (In fact, you can do that while it's mounted, live & running. Try moving a hardware RAID array with mounted filesystems from a 3ware card to an NFS share, without any interruption.)

"Crap" is your opinion and you're free to express it, but you should stop claiming that things are impossible just because you couldn't figure it out or didn't want to spend the necessary time learning, which in this case pretty much requires experimenting and testing in a methodical manner, not just reading the mdadm man page, though it definitely starts with that.

Brian K. White    brian@aljex.com    http://www.myspace.com/KEYofR
+++++[>+++[>+++++>+++++++<<-]<-]>>+.>.+++++.+++++++.-.[>+<---]>++.
filePro  BBx    Linux  SCO  FreeBSD    #callahans  Satriani  Filk!
--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org