Andrei Verovski (aka MacGuru) wrote:
Hi !
Hi Andrei,
I've got a Linux server with a failed 2-disk software RAID (and valuable data). fsck.ext3 tries to check it on startup but fails with "Unrecovered Read Error - Auto reallocate failed" errors. Looks like some blocks have gone bad.
The fstab entries read:
/dev/md0 /home ext3 acl, user_xattrs
/dev/sda1 /data2 ext3 defaults 1 1
/dev/sda2 /data3 ext3 defaults 1 1
First of all, I could not figure out - is this software RAID 0 or RAID 1
Run 'cat /proc/mdstat'. For each assembled md array it shows the RAID level and the member devices.
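As a rough sketch of how to read that output, the helper below pulls the RAID level of one array out of mdstat-format text. The function name raid_level is made up for illustration, and it assumes the usual "md0 : active raid1 sdb1[1] sda1[0]" line layout. An array that is listed as inactive has no level on its line, so nothing is printed for it.

```shell
#!/bin/sh
# Print the RAID level of one md array as listed in an mdstat-format file.
# Usage: raid_level md0 [file]   (file defaults to /proc/mdstat)
# raid_level is a hypothetical helper name, not part of mdadm.
raid_level() {
    dev=$1
    file=${2:-/proc/mdstat}
    # An active array's line looks like: "md0 : active raid1 sdb1[1] sda1[0]"
    awk -v dev="$dev" '$1 == dev && $2 == ":" {
        for (i = 3; i <= NF; i++)
            if ($i ~ /^(raid[0-9]+|linear|multipath|faulty)$/) { print $i; exit }
    }' "$file"
}
```

On the rescue system, 'raid_level md0' would then print raid0 or raid1 if the array is assembled. If it prints nothing, 'mdadm --examine' on a member partition should report the level stored in that partition's RAID superblock.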
(it is not mountable, so I cannot figure it out; md0 refers to the RAID device number, not to its level)? If this is RAID 1, I can simply remove the HDs one by one and eliminate the one with bad blocks. If this is RAID 0, the situation seems to be worse.
Let's hope it is RAID 1, but I think that if it were RAID 1, the failing HD would probably already have been kicked out of the array. Hope it's RAID 1, and let's hope I'm wrong...
Anyone have any idea how to recover it?
Well, first of all, cease all activity on any HD that is part of the array. DO NOT EVER run fsck on top of a failing HD. It can damage your precious files even further. Do the following (others may suggest another method, of course):

- Boot the openSUSE 11.0 rescue system found on the install DVD/CD.
- Try to assemble all your arrays with mdadm. Check first whether they were already assembled at boot.
- Mount your RAID filesystem read-only, like 'mount -t ext3 -o ro /dev/md0 /mnt'.
- cd to /mnt.
- tar the entire filesystem to another HD, like 'tar --preserve -zcvf destinationdir/destinationfile.tar.gz * > destinationdir/destinationfile.log.txt 2>&1' (the '2>&1' puts tar's error messages into the log as well).

Now, if you have unrecoverable errors, tar will complain that the files containing the errors were padded with zeros. Check the log file to see which were the unlucky ones.

Well, now you have the array filesystem in a gzipped tar archive. Almost all files should be binary-identical to those on the array filesystem. The exceptions are the files tar padded with zeros. Check whether the padded files are expendable. If so, you just got lucky. If not, then another approach is needed. Just try the few steps I mentioned and report back the results.
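The archive-and-check part of the steps above could be wrapped up roughly as follows. This is only a sketch: archive_tree and list_padded are made-up helper names, every path is a placeholder, and the grep pattern assumes tar's message contains the words "padding with zeros", so adjust it to whatever your tar actually prints. The steps that need the real disks are left as comments.

```shell
#!/bin/sh
# Sketch only: function names and all paths are placeholders.

# Steps that need the real hardware (run from the rescue system):
#   cat /proc/mdstat                    # is md0 already assembled?
#   mdadm --assemble --scan             # if not, try to assemble the arrays
#   mount -t ext3 -o ro /dev/md0 /mnt   # read-only mount

# Archive the tree under $1 into the gzipped tar file $2.
# tar's file list and its error messages both end up in the log file $3.
archive_tree() {
    src=$1; archive=$2; log=$3
    # -C enters the source directory; "." also picks up the top-level
    # dot-files that a "*" glob would skip.
    tar -C "$src" -p -czvf "$archive" . > "$log" 2>&1
}

# Print the log lines that mention zero padding (assumed message wording).
list_padded() {
    # Finding no such lines is the good outcome, not an error.
    grep -i 'padding with zeros' "$1" || true
}
```

For example: 'archive_tree /mnt /otherdisk/home.tar.gz /otherdisk/home.log.txt' followed by 'list_padded /otherdisk/home.log.txt', where /otherdisk stands for wherever the healthy destination disk is mounted. Keep the archive and the log off the failing disks.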
Now I am running "fsck.ext3 -p -v -c /dev/md0", but it seems it cannot eliminate the bad blocks.
fsck.ext3 will not correct/eliminate bad blocks. An 'Auto reallocate failed' error usually means that the HD was not able to relocate those sectors to clean/safe ones. I guess others can explain better or correct me: when a HD detects that a specific sector cannot be read, it retries the read on that sector a few times. If it succeeds, it copies the data to a free (never used) sector and updates its internal sector map so that the bad sector never gets used again. The only thing fsck.ext3 can do is force further relocation attempts, but the result can be worse, and other files/sectors can be damaged.
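If you want to see what the drive itself reports about relocated sectors, its SMART counters are one place to look. The sketch below is not something from this thread: it assumes smartmontools is installed, that the drive is ATA, and that it uses the common attribute names (they vary by vendor). smart_raw is a made-up helper that prints the raw value column from 'smartctl -A' output.

```shell
#!/bin/sh
# Print the raw value (10th column) of one SMART attribute.
# Reads `smartctl -A` style output on stdin; smart_raw is a hypothetical name.
# A typical line looks like:
#   5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
smart_raw() {
    awk -v name="$1" '$2 == name { print $10 }'
}
```

Usage would be something like 'smartctl -A /dev/sda | smart_raw Reallocated_Sector_Ct', and the same with Current_Pending_Sector. Non-zero values there would fit the picture of a drive that is running into unreadable sectors.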
Anyone have any idea what to do next?
Seems you have a few hours of work. Good luck...
Thanks in advance for any suggestion(s)
--
Rui Santos
http://www.ruisantos.com/
Veni, vidi, Linux!