Re: [LILO,HD] BIG BIG EMERGENCY!
"Michael H. Warfield" wrote:
On Mon, Oct 01, 2001 at 08:23:24PM +0200, Oliver Ob wrote:
I repost this because people advise me to simply look inside that partition (var log etc) and i _stress_ again that:
1) i cannot read hda
2) i cannot write hda (fdisk)
3) i cannot boot hda
ALL (!!!!!) repeat: ALL i did was change boot order and run lilo
NOTHING has been done on the hardware which worked 5 mins before.
sorry, but i had to stress that because most of the advice went in those directions.
WHAT HAPPENED:
==============
I am unable to access the hard disk hda, I cannot read the partition table since:
WHAT I DID:
===========
I just executed lilo after changing the boot _order_ (boot dos second, linux first), nothing extraordinary really. Lilo completed correctly as always, but since then I can neither boot nor even access any partition on that hda.
ERRORS:
=======
error 0x40 uncorrectable error LBAsect=9 sector=0
end_request: I/O error dev 03:00 (hda) sector
unable to read partition table
This is a hardware error. It's saying that the drive is reporting an uncorrectable (that's as in, retries and ECC have both failed to correct the error on the drive) error and can not read the data. The failed sector happens to be in the partition table.
you are right.
AND NOW?:
=========
Running fdisk /dev/hda says "cannot read partition data". Even trying with DOS 6.22's FDISK (fdisk /mbr) did not get me any further, since it complains it cannot read/write the partition table.
Yup... Everything points to a bad sector error on the disk.
true. but I wonder why i cannot even REwrite pt and mbr
EMERGENCY!!!!!!!!!!!!!!!!!!!!!!!!!!! REAL TROUBLE!!!!!!!!
=========================================================
I need my data back. I cannot repartition hda, is there any tool to help me out of this disaster? Maybe this is a lilo bug? Since I have not done anything to the data inside /etc/lilo.conf, only changed the order of appearance!!!! nothing more!
It is so unlikely to be any sort of a software bug it defies belief. There literally is nothing a "program" can do to corrupt a drive sector at that low level since ECC (Error Correcting Codes) are generated whenever data is written and SYNC patterns (low level head calibration patterns) are done when the drive is low-level formatted / initialized (requires special vendor utilities).
you got me wrong. i meant maybe LILO contains a bug and maybe lilo corrupted that mbr and pt. lilo is not perfect either. I say this, because running lilo was all i did. (reboot afterwards)
What may have happened is that something went wrong in the drive when lilo went to reinitialize the boot sector. Since that area of the disk is very rarely written to, it could have been developing a potential write error for a long time (these are often track synchronization errors and were the primary reason companies came out with things like "spin-rite"). Low level formatting might recover that sector. Merely repartitioning is highly unlikely to.
ok. the to-do order is:
1. evacuate data
2. low level format (need a tool for that quantum sirocco 2550)
3. repartition
4. try to get hda going again.
HELP PLEEEEEEZ!
You may be in big trouble...
I've recovered drives like this in the past. It is NOT for the faint of heart. If it's what I think it is (corrupted sector in the partition table) it is recoverable, but it's not going to be easy.
yes. it seems to me too that there are some damaged blocks. badblocks -b512 /dev/hda 1000 gave me:
0
1
130
131
what does that tell you? (cannot access manpages at this time because they are on there...)
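A rough way to read those four numbers: with -b512 each badblocks number is a 512-byte sector, so it can be mapped to C/H/S with plain integer arithmetic. The sketch below assumes the geometry quoted later in this thread (128 heads, 63 sectors per track); lba_to_chs is a hypothetical helper name, not an existing tool.

```shell
# Assumed geometry, taken from the values quoted later in the thread.
HEADS=128
SECTORS=63

# Hypothetical helper: convert a 0-based sector number to cylinder/head/sector.
lba_to_chs() {
  lba=$1
  per_cyl=$(( HEADS * SECTORS ))      # sectors in one cylinder
  c=$(( lba / per_cyl ))
  rem=$(( lba % per_cyl ))
  h=$(( rem / SECTORS ))
  s=$(( rem % SECTORS + 1 ))          # CHS sector numbers start at 1
  echo "$c/$h/$s"
}

for lba in 0 1 130 131; do
  echo "sector $lba -> C/H/S $(lba_to_chs "$lba")"
done
```

Under that assumed geometry all four sectors land in cylinder 0 (sector 0 is where the MBR and partition table live), which fits the picture of damage confined to the start of the disk.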
Several people have mentioned booting up from a bootable CD. You're gonna have to do it.
yes i did.. running rescue system on cd now.
If you have another IDENTICAL drive, this would make life a lot simpler. If you know how the drive was partitioned, even better. If you don't have an identical drive, it's going to be tougher, but not impossible.
i am going to buy a bigger drive tomorrow. as long as the 2nd drive is larger there seems no problem for me.
If you don't know what the partition table was, you are going to require specialized tools to locate your partition boundaries. The Coroner's Toolkit (TCT) has some capabilities there and I've heard of other people who have mentioned other partition table tools as well.
where to dl? have you got a link handy?
IAC... If nothing else has worked (which it apparently has not) then you are forced to assume that cylinder 0 is toast and you are NOT getting it back. The only exception to that is a trick I describe below but which I don't recommend unless you are comfortable with the Bootable CDs and comfortable with using dd. Because that "trick" is going to attempt to blindly write over those bad sectors, you are writing directly to the drive with data you are trying to recover. If you are NOT comfortable with doing that, then just continue at this point. If you want to try the trick, fine, just come back here if it didn't work.
read above: my values which badblocks gives me. and yes i am (comfortable with dd).
ok first the trick.... LOVE tricks....
=========================================================================
We'll assume that you know the geometry of the drive. Take the H and S from your C/H/S (heads and sectors per track), multiply them together and divide by 2 to get the size of that first cylinder in 1K blocks. Booted off the bootable CD, you can use dd to copy from the bad drive (let's say /dev/hdb) to a file on a good drive (which is why the drive needs to be larger).
CYLSIZE will be the size of cylinder zero here... I'll also assume that you have a big fat file system mounted on /mnt/0 to which you can write a copy of your file system.
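As a worked example of that arithmetic, here is a minimal sketch using the geometry reported later in this thread (128 heads, 63 sectors per track). Those numbers are an assumption taken from memory by the poster, so check them against the drive label or BIOS before relying on the result.

```shell
# Assumed geometry (from the values quoted later in the thread).
HEADS=128
SECTORS=63

# One cylinder = HEADS * SECTORS sectors of 512 bytes each;
# dividing by 2 converts 512-byte sectors into 1K blocks.
CYLSIZE=$(( HEADS * SECTORS / 2 ))

echo "CYLSIZE=$CYLSIZE"   # prints CYLSIZE=4032
```

The cylinder count does not enter into it: it only matters for the size of the whole drive, not for the size of cylinder zero.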
dd if=/dev/zero of=/mnt/0/oldfilesys bs=1024 count=$CYLSIZE
dd if=/dev/hdb of=/mnt/0/oldfilesys bs=1024 skip=$CYLSIZE seek=$CYLSIZE
The seek=$CYLSIZE on the second command keeps the zeroed first cylinder in place at the head of the file instead of overwriting it.
If that second dd command gives you errors, your drive is in serious hurt and you've had a hard drive hardware failure. Recovery will be expensive and unlikely to be successful.
Now you should have an image of your drive on /mnt/0/oldfilesys with the first "cylinder" zeroed.
Now either low-level format your old drive or replace it with an identical new drive. Partition the drive to EXACTLY what it was before the failure (or get ready to learn TCT).
Now go back to your bootable CD and copy the data back...
dd if=/mnt/0/oldfilesys of=/dev/hdb bs=1024 skip=$CYLSIZE seek=$CYLSIZE
Notes:
I put $CYLSIZE into the head of the file to allow for the case where you might try reconstructive work on the image itself. It might also help with TCT in resurrecting a viable image if you want to work directly on that image.
If you KNOW your partition table layout in cylinders and are real good at CHS -> sector number math, you can also try copying off individual partitions to other identically sized partitions. This is really not for the faint of heart and I would only try it if I had everything documented to the nines and it was a final act of desperation.
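The CHS -> sector number math for a single partition might look like the sketch below. The geometry is the one quoted in this thread; START_CYL, END_CYL and the output file name are purely hypothetical placeholders, and the sketch assumes the partition begins and ends exactly on cylinder boundaries (a first partition usually starts one track in, so it would need a different offset). It only prints the dd command so nothing is copied by accident.

```shell
HEADS=128               # assumed geometry from the thread
SECTORS=63
START_CYL=65            # hypothetical first cylinder of the partition
END_CYL=100             # hypothetical last cylinder of the partition

SECT_PER_CYL=$(( HEADS * SECTORS ))
SKIP=$(( START_CYL * SECT_PER_CYL ))                      # first sector to read
COUNT=$(( (END_CYL - START_CYL + 1) * SECT_PER_CYL ))     # sectors to copy

# Print, do not run: review the numbers before touching the drive.
echo "dd if=/dev/hdb of=/mnt/0/part.img bs=512 skip=$SKIP count=$COUNT"
```

With these placeholder values the command comes out as skip=524160 count=290304.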
Note... Preferred method of attack is to get the data copied to another identical drive. That way you don't run the risk of scragging that original drive and then discovering that you made a mistake in the copy process somewhere. If you have to use the single drive / low-level format approach, make sure the image contains good data and is intact, BEFORE beginning the initialization.
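Before running any of this against a real drive, the padded-image round trip can be rehearsed on scratch files. The sketch below is only that, a rehearsal: the file names, the 16K "drive" and the 4K "cylinder" are made-up stand-ins. It uses seek= together with skip= so the data sits at the same offset in the drive and in the image, and conv=notrunc so dd does not shorten the output files.

```shell
# Rehearsal on scratch files only; no real device is touched.
WORK=$(mktemp -d)          # hypothetical scratch directory
CYLSIZE=4                  # pretend cylinder 0 is 4 blocks of 1K
DISK_KB=16                 # pretend the whole drive is 16K

# Stand-in for the failing drive, filled with random data.
dd if=/dev/urandom of="$WORK/old.img" bs=1024 count=$DISK_KB 2>/dev/null

# 1. Zero padding where cylinder 0 would be.
dd if=/dev/zero of="$WORK/image" bs=1024 count=$CYLSIZE 2>/dev/null
# 2. Copy everything after cylinder 0 to the same offset in the image.
dd if="$WORK/old.img" of="$WORK/image" bs=1024 \
   skip=$CYLSIZE seek=$CYLSIZE conv=notrunc 2>/dev/null

# Stand-in for the reformatted or replacement drive.
dd if=/dev/zero of="$WORK/new.img" bs=1024 count=$DISK_KB 2>/dev/null
# 3. Copy the data back, skipping cylinder 0 on both sides.
dd if="$WORK/image" of="$WORK/new.img" bs=1024 \
   skip=$CYLSIZE seek=$CYLSIZE conv=notrunc 2>/dev/null

# Compare everything past cylinder 0 on the old and new stand-ins.
OFFSET=$(( CYLSIZE * 1024 + 1 ))
OLD_SUM=$(tail -c +"$OFFSET" "$WORK/old.img" | cksum)
NEW_SUM=$(tail -c +"$OFFSET" "$WORK/new.img" | cksum)
IMG_SIZE=$(wc -c < "$WORK/image" | tr -d ' ')
if [ "$OLD_SUM" = "$NEW_SUM" ]; then
  echo "data past cylinder 0 matches"
fi
rm -rf "$WORK"
```

If the checksums differ in the rehearsal, the offsets are wrong somewhere and the same commands would scramble a real drive.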
=========================================================================
One last trick if you are really REALLY brave. It is much less work and can be tried quickly. It is not really likely to work, but it might. I've seen it happen.
Boot from the Linux BBC or some other favorite bootable CD. Assuming that the drive is /dev/hda, run the following command:
ok
dd if=/dev/zero of=/dev/hda bs=1024 count=32
ok. verified. run! BLOOOOOD! ahahah
Note: If you leave off the "count=32" it will eat your entire drive irrecoverably (and do it really fast).
If you don't get an error from the above command, it may have laid down a clean zeroed sector over your partition table. NOW you may be able to go back and rebuild that partition table with fdisk, if you knew what it was. If you get ANY sort of error from that dd command, abort the effort and fall back to the suggestions above.
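For a sense of how far that dd reaches, here is a small sketch of the arithmetic, checked against the sector numbers badblocks reported earlier in this thread (0, 1, 130, 131, in 512-byte units). Nothing here touches a disk; it is only a calculation.

```shell
BS=1024
COUNT=32
SECTORS_ZEROED=$(( BS * COUNT / 512 ))   # 512-byte sectors covered by the dd
LAST=$(( SECTORS_ZEROED - 1 ))           # highest sector number overwritten

COVERED=""
UNTOUCHED=""
for bad in 0 1 130 131; do               # sectors reported by badblocks -b512
  if [ "$bad" -le "$LAST" ]; then
    COVERED="$COVERED $bad"
  else
    UNTOUCHED="$UNTOUCHED $bad"
  fi
done

echo "dd overwrites sectors 0..$LAST"
echo "rewritten:$COVERED"
echo "not touched:$UNTOUCHED"
```

So the command rewrites sectors 0 through 63, which covers the partition table in sector 0 but leaves the reported sectors 130 and 131 as they were. On a 63-sector-per-track layout it also reaches one sector past the first track, which is worth keeping in mind if the first partition starts there.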
executed with no error. just 32 records in and out. now run and try fdisk
WOWW!!!!!! GOD! I SEE FDISK RUNNING!!!!!
one should have an option for fdisk to FORCE fdisk to run in such cases!! hope the programmers read this!
only point: i have now been searching my entire apartment for FOUR hours, i lost that sheet where i wrote the partition table down. is there a tool to estimate partition sizes from their boundaries? that tool should scan the entire hda and report the 1st and last cylinder for any partition. is that possible at all?
at this time i am unsure about entering an entirely new pt. i do not have my sheet anymore, but if I try and enter totally different values here, I guess it'll NOT work. it was more or less like this:
hda: 128h 63sec 621cyl
hda1: 6 256MB fat16
hda2: 83 ???MB linux main
hda3: 82 32MB linux swap (or 64?)
hda4: 6 1.9GB dos data drive
so varying all those values could end in even more disaster, so I await your advice and do not yet write the pt.
isn't there TWO pt (one backup) on any hd? so why not simply copy the backup over the first?
greetings and big big thanks for helping to all of you here!
Michael H. Warfield | (770) 985-6132 | mhw@WittsEnd.com
(The Mad Wizard) | (678) 463-0932 | http://www.wittsend.com/mhw/
NIC whois: MHW9 | An optimist believes we live in the best of all
PGP Key: 0xDF1DD471 | possible worlds. A pessimist is sure of it!
what ACNs are 770 and 678? leased ACNs?
--
*º¤., ¸¸,.¤º*¨¨¨*¤ =Oliver@home= *º¤., ¸¸,.¤º*¨¨*¤
I http://www.bmw-roadster.de/Friends/Olli/olli.html I
I http://www.bmw-roadster.de/Friends/friends.html I
I http://groups.yahoo.com/group/VGAP-93 I
I http://home.t-online.de/home/spacecraft.portal I
Telek0ma iBBMS - soon back online +49.4503.TRSi1/TRSi2 <<<