boot problem - dying HD ?
Hello all:

I have a serious problem with a system running SUSE 9.2. From one day to another the system could not boot. Instead it gave a lot of error messages and asked for the root password, finally saying that the root file system was corrupted. Auto fsck failed on the root filesystem.

I booted from a SUSE 9.2 rescue/install disk, selected rescue mode, and tried to save all the files from the root filesystem to another partition before running fsck manually. It did not work and I got a lot of I/O errors.

I took this HD (let's call it HD1) out of the computer and connected it to another motherboard in another computer, and now I am trying to boot that computer, but it also fails. (That computer has an identical HD, let's call it HD2.) Without HD1 connected, that computer boots normally from its own HD2. Both HDs are SATA and the BIOS sees them as first IDE master and slave.

Right now my problem is that if I connect the possibly bad drive HD1 to the other system, it also fails to boot correctly. The boot process is very slow and I get the following error messages during boot:

  Buffer I/O error on device sdb2, logical block xxxxxx
  ata1: status 0x51 { DriveReady SeekComplete Error }
  ata1: error=0x40 { Uncorrectable Error}
  scsi 0 : ERROR on channel 0, id1, lun0
  current sdb: sense key Medium error
  Additional sense: Unrecovered read error: auto reallocate failed
  end request: I/O error, dev sdb sector xxxxxxxx

and it goes on like this for a really long time with different xxxxxx values. Finally I got error messages that processes could not be started because of a read-only filesystem.

If I disconnect 'sdb' (HD1) and only sda (HD2) is connected, the boot process is OK.

Questions:
1. Do these messages mean that HD1 is dying?
2. How could I disable the sdb check to be able to boot normally from sda?
3. Why does connecting sdb2 prevent the system from booting normally from sda?

Thanks,
IG

ps: Sorry for being verbose.
On 11/22/05, Istvan Gabor wrote:
Hello all:
I have a serious problem with a system running SUSE 9.2. From one day to another the system could not boot. Instead it gave a lot of error messages and asked for the root password, finally saying that the root file system was corrupted. Auto fsck failed on the root filesystem.
[snipped lots of lines with useful info]
Questions:
1. Do these messages mean that HD1 is dying?
Yep! I'd say so.
2. How could I disable the sdb check to be able to boot normally from sda?
I suppose you'd have to set the last field in /etc/fstab to 0 (zero), turning off fsck at boot time.
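For illustration: the sixth field of an /etc/fstab entry is the fsck pass number, and 0 means the filesystem is not checked at boot. Below is a minimal Python sketch of that change. It assumes a hypothetical entry for /dev/sdb2 mounted on /mnt/hd1; the mount point, filesystem type and options are not from this thread.

```python
def disable_boot_fsck(fstab_text, device):
    """Return fstab text with the fsck pass field set to 0 for `device`."""
    out = []
    for line in fstab_text.splitlines():
        fields = line.split()
        # Leave comments, blank lines and entries for other devices untouched.
        if not fields or fields[0].startswith("#") or fields[0] != device:
            out.append(line)
            continue
        # fstab fields: device, mount point, type, options, dump, pass.
        # Missing dump/pass fields default to 0, so pad before editing.
        fields += ["0"] * (6 - len(fields))
        fields[5] = "0"
        out.append(" ".join(fields))
    return "\n".join(out) + "\n"


# Hypothetical sample; a real /etc/fstab will differ.
SAMPLE = (
    "/dev/sda2  /         reiserfs  acl,user_xattr  1 1\n"
    "/dev/sdb2  /mnt/hd1  reiserfs  acl,user_xattr  1 2\n"
)

if __name__ == "__main__":
    print(disable_boot_fsck(SAMPLE, "/dev/sdb2"), end="")
```

Note that a pass value of 0 only skips fsck. The filesystem is still mounted at boot unless the entry also carries the noauto option.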
3. Why does connecting sdb2 prevent the system from booting normally from sda?
By mounting the filesystem at startup the driver probes the drive
regularly and performs some autonomous tasks that lead to errors with
damaged disks. The same errors will occur when mounting the
particular filesystem manually.
To put it straight: an "auto reallocate" error is a serious physical
error; the disk has lost the data on some of the blocks. Get a new
drive.
\Steve
--
Steve Graegert
The Tuesday 2005-11-22 at 18:06 +0100, Istvan Gabor wrote:
  Buffer I/O error on device sdb2, logical block xxxxxx
  ata1: status 0x51 { DriveReady SeekComplete Error }
  ata1: error=0x40 { Uncorrectable Error}
  scsi 0 : ERROR on channel 0, id1, lun0
  current sdb: sense key Medium error
  Additional sense: Unrecovered read error: auto reallocate failed
  end request: I/O error, dev sdb sector xxxxxxxx
I think that disk has sectors with read errors, and the disk is trying to reallocate them. That is relatively normal, but it says that it failed, and that is indeed worrying. It could be that the space reserved for reallocation is used up. It would be interesting to read the SMART log of that disk, but I think that SATA support is not included/finished in smartctl.
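To get a feel for how widespread the damage is, the failing sector numbers can be pulled out of the kernel log. This is a rough Python sketch. It assumes the messages have the shape quoted above ("I/O error, dev sdb sector N"). The sample lines and sector numbers are made up, since the original post elided them.

```python
import re

# Matches e.g. "I/O error, dev sdb sector 123456". The comma after the
# device name is optional because kernels differ in the exact wording.
IO_ERROR = re.compile(r"I/O error, dev (\w+),? sector (\d+)")


def failing_sectors(log_lines, device):
    """Return the sorted, de-duplicated sectors that failed on `device`."""
    sectors = set()
    for line in log_lines:
        match = IO_ERROR.search(line)
        if match and match.group(1) == device:
            sectors.add(int(match.group(2)))
    return sorted(sectors)


# Made-up sample log; the sector numbers are illustrative only.
SAMPLE_LOG = [
    "end request: I/O error, dev sdb sector 1048583",
    "Buffer I/O error on device sdb2, logical block 131072",
    "end_request: I/O error, dev sdb, sector 1048591",
    "end request: I/O error, dev sdb sector 1048583",
]

if __name__ == "__main__":
    print(failing_sectors(SAMPLE_LOG, "sdb"))
```

A handful of sectors clustered together suggests a localised defect. A list that keeps growing across the whole disk fits the "dying drive" diagnosis.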
1. Do these messages mean that HD1 is dying?
I think so.

--
Cheers,
Carlos Robinson
On 11/22/05 3:08 PM, "Carlos E. R." wrote:
The Tuesday 2005-11-22 at 18:06 +0100, Istvan Gabor wrote:
  Buffer I/O error on device sdb2, logical block xxxxxx
  ata1: status 0x51 { DriveReady SeekComplete Error }
  ata1: error=0x40 { Uncorrectable Error}
  scsi 0 : ERROR on channel 0, id1, lun0
  current sdb: sense key Medium error
  Additional sense: Unrecovered read error: auto reallocate failed
  end request: I/O error, dev sdb sector xxxxxxxx
I think that disk has sectors with read errors, and the disk is trying to reallocate them. That is relatively normal, but it says that it failed, and that is indeed worrying. It could be that the space reserved for reallocation is used up. It would be interesting to read the SMART log of that disk, but I think that SATA support is not included/finished in smartctl.
SATA disks are actually pretty well supported by smartctl; I use it to run scheduled tests, and health monitoring, on about 35 SATA systems.

To the OP: try doing `smartctl -l error /dev/sdb` (the -l option needs a log type). That should give you an extended printout of the errors that the disk has recorded.
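Once smartctl can talk to the drive, the attribute table printed by `smartctl -A` is usually the quickest health summary. The Python sketch below picks the reallocation-related counters out of that table. The sample text is invented for illustration, and the attribute names are the ones smartmontools commonly reports; a given drive may not provide all of them.

```python
# Attributes that usually matter for a disk with unreadable sectors.
WATCHED = ("Reallocated_Sector_Ct", "Current_Pending_Sector",
           "Offline_Uncorrectable")


def watched_raw_values(smartctl_output):
    """Map each watched attribute name to its integer raw value."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have 10 columns:
        # ID NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in WATCHED:
            values[fields[1]] = int(fields[9])
    return values


# Invented sample in the layout of `smartctl -A /dev/sdb`.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   001   001   036    Pre-fail  Always   FAILING_NOW 2047
  9 Power_On_Hours          0x0032   090   090   000    Old_age   Always       -       8760
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       12
"""

if __name__ == "__main__":
    print(watched_raw_values(SAMPLE))
```

A non-zero pending count together with a reallocated count that has stopped rising would be consistent with the "auto reallocate failed" message, though only the drive's real output can confirm that.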
The Tuesday 2005-11-22 at 17:22 -0500, Ian Marlier wrote:
SATA disks are actually pretty well supported by smartctl; I use it to run scheduled tests, and health monitoring, on about 35 SATA systems.
I'm glad to hear that. The README.SATA led me to think the contrary. I see now that it is only some setups that don't work:

  README for S.M.A.R.T. on SATA discs

  Smartmontools should work correctly with SATA drives under both Linux 2.4 and 2.6 kernels, if you use the standard IDE drivers. If you use libata drivers, you need at least kernel 2.6.11. For older versions it won't work correctly because libata didn't support the needed ATA-passthrough ioctl() calls. For more read http://smartmontools.sourceforge.net/.

  ...

  If, however the IDE driver doesn't support your particular SATA controller, or the controller doesn't have a legacy interface at all, then only libata can be used.

  As far as we know, the IDE driver only works on Intel, VIA and nVidia controllers.

Perhaps the situation has improved in SuSE 10. Could you expand on it?

--
Cheers,
Carlos Robinson
On 11/22/05 6:49 PM, "Carlos E. R." wrote:
The Tuesday 2005-11-22 at 17:22 -0500, Ian Marlier wrote:
SATA disks are actually pretty well supported by smartctl; I use it to run scheduled tests, and health monitoring, on about 35 SATA systems.
I'm glad to hear that. The README.SATA led me to think the contrary. I see now that it is only some setups that don't work:
README for S.M.A.R.T. on SATA discs
Smartmontools should work correctly with SATA drives under both Linux 2.4 and 2.6 kernels, if you use the standard IDE drivers. If you use libata drivers, you need at least kernel 2.6.11. For older versions it won't work correctly because libata didn't support the needed ATA-passthrough ioctl() calls. For more read http://smartmontools.sourceforge.net/.
...
If, however the IDE driver doesn't support your particular SATA controller, or the controller doesn't have a legacy interface at all, then only libata can be used.
As far as we know, the IDE driver only works on Intel, VIA and nVidia controllers.
Perhaps the situation has improved in SuSE 10. Could you expand on it?
I'm actually running 9.3 Pro on all of the aforementioned systems. Kernel is 2.6.11.4-20a. Some of the systems use the standard IDE disk driver; others use libata.

The drives are all Western Digital, ranging from 36GB at the small end to 350GB at the high end. They're variously connected to 3ware SATA RAID cards, Silicon Image controllers (sata_sil will appear in the lsmod output if your machine has one of these), and some other kind of controller that I can't remember off the top of my head.

I've also tested smartctl with another manufacturer's drive in one of these systems, and it worked fine there, too. I think that drive was a Seagate, but don't quote me on it.

The only thing that doesn't always work is drive manufacturer/model identification; quite a few SATA drives just aren't in the smartctl database. However, the test and monitor functions work fine regardless.

Basically, you should be able to use smartctl with a SATA drive on a 9.3 Pro system or newer. Below that, you're going to have to do some playing with kernel versions and alternate driver modules in order to get it going.
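The README's rule of thumb (libata needs at least kernel 2.6.11 for the ATA pass-through ioctl) can be checked mechanically. This is a small Python sketch that assumes the release string looks like the output of `uname -r`. The helper names are hypothetical, and the second sample string is only an example of an older release.

```python
import re

LIBATA_MIN = (2, 6, 11)  # minimum kernel quoted in README.SATA


def kernel_tuple(release):
    """Turn a release string such as '2.6.11.4-20a' into (2, 6, 11)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return tuple(int(part) for part in match.groups())


def libata_smart_possible(release):
    """True if the kernel meets the README's minimum for SMART over libata."""
    return kernel_tuple(release) >= LIBATA_MIN


if __name__ == "__main__":
    for release in ("2.6.11.4-20a", "2.6.8-24"):
        print(release, libata_smart_possible(release))
```

Passing the version check is necessary but not sufficient: the installed smartmontools also has to know how to use the pass-through, as the smartd log message later in this thread shows.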
Ian,

On Tuesday 22 November 2005 14:22, Ian Marlier wrote:
...
SATA disks are actually pretty well supported by smartctl; I use it to run scheduled tests, and health monitoring, on about 35 SATA systems.
From my /var/log/messages (there's plenty more where this came from):

Nov 15 07:25:52 twain smartd[16797]: Device /dev/sda, SATA disks accessed via libata are not currently supported by smartmontools. When libata is given an ATA pass-thru ioctl() then an additional '-d libata' device type will be added to smartmontools.
....
Randall Schulz
The Tuesday 2005-11-22 at 18:38 -0800, Randall R Schulz wrote:
From my /var/log/messages (there's plenty more where this came from):
Nov 15 07:25:52 twain smartd[16797]: Device /dev/sda, SATA disks accessed via libata are not currently supported by smartmontools. When libata is given an ATA pass-thru ioctl() then an additional '-d libata' device type will be added to smartmontools.
That's what I heard. So there are some that work and some that don't.

--
Cheers,
Carlos Robinson
Istvan Gabor
I have a serious problem with a system running SUSE 9.2. From one day to another the system could not boot.
[snipped]

Thanks for the helpful answers. I replaced my HD.

IG
participants (5)
- Carlos E. R.
- Ian Marlier
- Istvan Gabor
- Randall R Schulz
- Steve Graegert