On Mon, Sep 22, 2008 at 5:59 PM, Brian K. White wrote:

----- Original Message -----
From: "Andrew Joakimsen"
To: "Brian K. White"
Cc:
Sent: Monday, September 22, 2008 4:05 PM
Subject: Re: [opensuse] 10.2 no RAID to 11.0 RAID 1

On Mon, Sep 22, 2008 at 3:27 PM, Brian K. White wrote:

It's perfectly possible to force a rebuild. In fact, you can force rebuilds in mdadm in situations where no firmware raid will ever let you. If you don't know how, that's a you problem, not an mdadm problem.
I know how, and I issued the right command. It says /dev/sdb3 or whatnot DOES NOT EXIST.
But if you do ll /dev/sdb3 or even cat /dev/sdb3, the device is obviously there.
So yes, mdadm is crap and should never be used. If you need to do mdadm /dev/md0 --fail /dev/sdb3 and it says sdb3 does not exist, there is a serious issue of the developers piping their toilets into their code.
Wrong. (Unless you can supply enough exact commands and responses and other observations to prove your diagnostic process and deductions aren't full of holes. You have not done so above.)
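One way to gather the "exact commands and responses" asked for above is to capture the kernel's view and mdadm's view of the array side by side. The following is only a minimal sketch, assuming the array is /dev/md0 and the partition is /dev/sdb3 as named in this thread. The helper names run and collect_md_info are hypothetical, chosen for illustration. By default the script only prints the commands; set DRY_RUN=0 and run as root to actually execute them.

```shell
#!/bin/sh
# Sketch: collect the state of an md array and one member partition.
# Assumption: /dev/md0 and /dev/sdb3 are the names from this thread.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: gather what the kernel and mdadm each report.
collect_md_info() {
    md="$1"
    part="$2"
    run cat /proc/mdstat          # kernel's list of arrays and members
    run mdadm --detail "$md"      # array state as seen through the md device
    run mdadm --examine "$part"   # md superblock stored on the partition
    run ls -l "$part"             # confirm the device node itself exists
}

collect_md_info /dev/md0 /dev/sdb3
```

Posting the output of all four commands together shows whether the partition is currently a member of the running array, which is a separate question from whether the device node exists.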
I still have the drives. I am still looking for real instructions on how to use mdadm. One of the "step by step" guides even shows one of the errors as normal output! So I figured, what the hell, let me continue anyway, and of course it did not work.
I have seen a few different things that were each different problems, yet each could have been described roughly as above. In each case the drive was not actually unavailable, and all desired operations could be performed somehow. The exact steps varied in each case because the exact problem varied in each case. I don't know which of those problems you actually had, because, as I said, even in my own limited experience there was more than one way to get something roughly like that. So I can't say exactly what you could or should have done that would have worked.
Ah, so there is no universal test case. There has to be. Let's assume one drive is "bad". What, then, is the correct way to indicate this through mdadm and start the now "degraded" RAID-1 array?
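For the simple case asked about here, the usual sequence can be sketched as follows. This is a hedged sketch, not a universal recipe: it assumes a two-disk RAID-1 at /dev/md0 with /dev/sda3 as the good member and /dev/sdb3 as the "bad" one (names taken from or modelled on this thread). The helpers run, replace_member and start_degraded are hypothetical names. As far as I know, --fail only applies to a partition that is currently a member of the running array, so if the kernel has already kicked the disk out, that step will report an error and can be skipped. By default the script only prints the commands; set DRY_RUN=0 and run as root to execute them.

```shell
#!/bin/sh
# Sketch: mark a RAID-1 member faulty, remove it, re-add it, or start
# the array degraded from the one good member.
# Assumption: device names are illustrative; adjust to your system.
# DRY_RUN defaults to 1, so commands are printed rather than executed.

DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, else run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Array is running: fail the member, remove it, then add it (or its
# replacement) back, which triggers a resync onto that partition.
replace_member() {
    md="$1"
    part="$2"
    run mdadm --manage "$md" --fail "$part"
    run mdadm --manage "$md" --remove "$part"
    run mdadm --manage "$md" --add "$part"
}

# Array is not running: assemble it from the good member only and
# force it to start even though it is incomplete (degraded).
start_degraded() {
    md="$1"
    good="$2"
    run mdadm --assemble --run "$md" "$good"
}

start_degraded /dev/md0 /dev/sda3
replace_member /dev/md0 /dev/sdb3
```

Rebuild progress after the --add step can be watched in /proc/mdstat.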
This all assumes good hardware, btw. A buggy disk or controller could actually make a disk appear bad and then later good again, or good and then lock up, etc. As far as I'm concerned, you could even have bad hardware. You are saying something doesn't work, but you are not showing your deductive process, and so the claim is meaningless. Send me your problem disks that you think are impossible to assemble, and I bet in a little while I can tell you how to assemble the array, as long as there actually is enough there to use. (If you did something stupid and blew away metadata that can't be recreated or inferred, well, no hardware raid card will save you from that either.)
All I can say is the systems have an ASUS P4P800-VM mainboard (Intel 865G chipset). They ran Fedora for 2 years, and then I replaced the hard drives and installed openSUSE on md RAID. The same thing happens on two systems physically 20 miles apart. The hard drive manufacturer's long test "passed" on all four drives. The fact that I can mount each of the partitions that made up /dev/md0, and that the md5 sums of all important files on the system match on both partitions (and just the fact that I can read the data off the individual partitions), further shows that it is not a hardware issue. I still have the drives; if I am wrong, I have no problem admitting it.
And I'm not even slightly an mdadm guru. I simply spent a good solid weekend and then several smaller incidents experimenting. I would say it's still black art to me. But even at this level I have already performed actions you claim are impossible, and have seen symptoms like you describe above, except I looked at the problem longer than 13 seconds and discovered the problem was not as it seemed and that it was perfectly solvable in every case so far. That includes those 10 boxes I was talking about. The disks kept failing randomly, but it was always possible to rebuild and rejoin them. It sometimes took some poking and insight. I'm not saying it was always obvious what to do or why. Just that it always turned out to be doable, even when it looked impossible based on the first and most obvious commands.
So far my assertion stands. You should not expect mdraid to work for you, but that has no bearing on other people or on mdraid itself. You are merely saying that because you don't know how to fly helicopters, helicopters are garbage.
Prove me wrong. Because no one has been able to provide the proper commands to rebuild an array. There is no documentation on how to do it, the man page is vague, and the commands don't work correctly.