Hard Drive Kernel Error Messages?
Hi list,

I recently put SuSE 8.1 on a used PIII 1 GHz box and have just had a crash. I noticed the following in my /var/log/messages file:

Aug 5 10:58:00 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:00 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490808
Aug 5 10:58:00 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490808
Aug 5 10:58:06 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:06 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490816
Aug 5 10:58:06 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490816
Aug 5 10:58:12 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:12 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490824
Aug 5 10:58:12 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490824
Aug 5 10:58:17 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:17 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490832
Aug 5 10:58:17 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490832
Aug 5 10:58:23 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:23 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490840
Aug 5 10:58:23 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490840
Aug 5 10:58:29 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:29 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490848
Aug 5 10:58:29 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490848
Aug 5 10:58:34 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:34 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490856
Aug 5 10:58:34 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490856
Aug 5 10:58:40 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:40 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490864
Aug 5 10:58:40 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490864
Aug 5 10:58:45 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:58:45 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490872
Aug 5 10:58:45 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490872
Aug 5 10:58:51 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:59:04 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490880
Aug 5 10:59:04 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490880
Aug 5 10:59:04 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:59:04 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041501, sector=10490888
Aug 5 10:59:04 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490888
Aug 5 10:59:04 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
Aug 5 10:59:04 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041509, sector=10490896
Aug 5 10:59:04 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490896

I have noticed this happening about every 5 minutes since August 2nd; the log goes back to July 23. I also notice this every now and then:

Aug 5 11:16:58 langly kernel: cdrom: open failed.

Finally, I just realized on bootup that this drive is configured as IDE Primary Slave, not Master (this is how it was when I got it). I have a Secondary Master CDROM drive. Could this be causing problems?

Here is the output from hdparm:

langly /var/log# hdparm -i /dev/hdb

/dev/hdb:

 Model=WDC WD300AB-00BVA0, FwRev=21.01H21, SerialNo=WD-WMA7H1415190
 Config={ HardSect NotMFM HdSw>15uSec SpinMotCtl Fixed DTR>5Mbs FmtGapReq }
 RawCHS=16383/16/63, TrkSize=57600, SectSize=600, ECCbytes=40
 BuffType=DualPortCache, BuffSize=2048kB, MaxMultSect=16, MultSect=16
 CurCHS=16383/16/63, CurSects=16514064, LBA=yes, LBAsects=58633344
 IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:120}
 PIO modes: pio0 pio1 pio2 pio3 pio4
 DMA modes: mdma0 mdma1 mdma2
 UDMA modes: udma0 udma1 udma2 udma3 udma4 *udma5
 AdvancedPM=no WriteCache=enabled
 Drive conforms to: device does not report version: 1 2 3 4 5

Any help would be appreciated.

Josh
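The repeated entries in this log can be boiled down to the distinct failing sectors. This is a minimal sketch, assuming the log format shown in this message; the function name failing_lbas is made up for illustration.

```shell
# Print each distinct LBAsect value found on standard input, in numeric order.
# Assumes kernel lines of the form "... LBAsect=18041498, sector=...".
failing_lbas() {
    grep -o 'LBAsect=[0-9]*' | cut -d= -f2 | sort -nu
}
```

Running `failing_lbas < /var/log/messages` against the excerpt in this message would list 18041498, 18041501 and 18041509, which points to a small cluster of unreadable sectors rather than errors scattered across the disk.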
On Tue, 5 Aug 2003 11:40:22 -0500, Josh Trutwin wrote:
> Any help would be appreciated.

I suggest you boot with the rescue system and run badblocks. If you are using ext2/3, you can just run fsck with the -c option, which will run badblocks and mark the bad blocks in one go. If you are using ReiserFS, sorry, I can't help you. You can run hdparm with the -D1 option in both cases to see if it helps. If there are no bad blocks, you can try turning DMA off with hdparm -d0 and see if the problem goes away.

Charles

-- 
"MSDOS didn't get as bad as it is overnight -- it took over ten years of careful development." (By dmeggins@aix1.uottawa.ca)
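The checks suggested in this reply can be collected into one list. The sketch below only prints the commands rather than running them; print_check_plan and both device arguments are placeholders for illustration, and the partition is assumed to be unmounted (for example, from the rescue system).

```shell
# Print (do not run) the suggested checks for one disk and one partition.
# e2fsck -c is the ext2/ext3 checker that "fsck -c" ends up calling.
print_check_plan() {
    disk=$1   # whole drive, e.g. /dev/hdb
    part=$2   # one partition on that drive (placeholder)
    echo "badblocks -sv $part   # read-only surface scan with progress"
    echo "e2fsck -c $part       # ext2/ext3 only: scan and record bad blocks"
    echo "hdparm -D1 $disk      # enable on-drive defect management"
    echo "hdparm -d0 $disk      # turn DMA off if no bad blocks turn up"
}
```

Printing first and pasting the lines one at a time keeps each step deliberate, which seems prudent on a drive that is already throwing I/O errors.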
On Tue, 5 Aug 2003 13:16:33 -0400, Charles Philip Chan wrote:
> On Tue, 5 Aug 2003 11:40:22 -0500, Josh Trutwin wrote:
> > Any help would be appreciated.
>
> I suggest you boot with the rescue system and run badblocks. If you are using ext2/3, you can just run fsck with the -c option, which will run badblocks and mark the bad blocks in one go. If you are using ReiserFS, sorry, I can't help you. You can run hdparm with the -D1 option in both cases to see if it helps. If there are no bad blocks, you can try turning DMA off with hdparm -d0 and see if the problem goes away.

Thanks. It is reiser, and there is a reiserfsck program that I've been manning that I can use to check for bad blocks ala fsck. Does booting in safe mode mount all partitions read-only?

Some of the reading I've done on Google for this error indicates that it could be a simple IDE cabling problem, or it could be a bad drive...

RE: zentara's post, I'm not running Windohs on this drive, just SuSE, so that's not an issue. Thanks for the suggestion, though.

Josh
On Tue, 5 Aug 2003 13:02:20 -0500, Josh Trutwin wrote:
> It is reiser, and there is a reiserfsck program that I've been manning that I can use to check for bad blocks ala fsck. Does booting in safe mode mount all partitions read-only?

Boot from the SUSE CD and choose the rescue system. You can then run the reiserfsck program on each partition of the unmounted drive.

Charles

-- 
"MSDOS didn't get as bad as it is overnight -- it took over ten years of careful development." (By dmeggins@aix1.uottawa.ca)
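To pick which partition to check first, the "dev 03:46" in the original log can be decoded. This is a sketch under two assumptions not stated in the thread: that this kernel prints major:minor in hexadecimal, and that the classic IDE numbering applies (major 3 for hda/hdb, major 22 for hdc/hdd, 64 minor numbers per drive). decode_ide_dev is a hypothetical helper name.

```shell
# Turn a "major:minor" pair such as 03:46 into an IDE device name.
# Assumption: both fields are hexadecimal, as printed by 2.4-era kernels.
decode_ide_dev() {
    major=$((0x${1%%:*}))
    minor=$((0x${1##*:}))
    unit=$((minor / 64))    # 0 = master, 1 = slave on that channel
    part=$((minor % 64))    # 0 = whole disk, N = partition N
    case "$major:$unit" in
        3:0)  disk=hda ;;
        3:1)  disk=hdb ;;
        22:0) disk=hdc ;;
        22:1) disk=hdd ;;
        *) echo "unsupported device $1" >&2; return 1 ;;
    esac
    if [ "$part" -eq 0 ]; then
        echo "/dev/$disk"
    else
        echo "/dev/$disk$part"
    fi
}
```

Under those assumptions `decode_ide_dev 03:46` prints /dev/hdb6, so that partition would be the first one to hand to reiserfsck.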
> Boot from the SUSE CD and choose the rescue system. You can then run the reiserfsck program on each partition of the unmounted drive.

Which reports that it cannot read one of the blocks in the root partition. Sigh. Guess I'll go get a new drive....

Is it possible to do a copy from one drive to another without having to reinstall SuSE?

Josh
On Wed, 6 Aug 2003 09:47:27 -0500, Josh Trutwin wrote:
> > Boot from the SUSE CD and choose the rescue system. You can then run the reiserfsck program on each partition of the unmounted drive.
>
> Which reports that it cannot read one of the blocks in the root partition. Sigh. Guess I'll go get a new drive....

Heh, I'll mention it again: try to zero it all out before you buy a new one.

First kill off the boot sector (use the device name of the failing drive here):

    dd if=/dev/zero of=/dev/hda bs=512 count=1

Then fdisk it into 10 gig partitions, for example, format each partition, and run something like this on each mounted partition:

    #!/bin/sh
    dd if=/dev/zero of=/mnt/1 bs=8192
    dd if=/dev/zero of=/mnt/2 bs=8192
    dd if=/dev/zero of=/mnt/3 bs=8192
    dd if=/dev/zero of=/mnt/4 bs=8192
    dd if=/dev/zero of=/mnt/5 bs=8192

That will fill a 10 gig partition. Then delete the files and run your filesystem check again. You may be surprised. If you succeed, you will be like me, and start dreaming up theories why.

-- 
I'm not really a human, but I play one on earth.
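Whether a particular sector reads back, before or after zeroing, can be checked without writing anything. This is a minimal sketch, assuming 512-byte sectors; sector_readable is a hypothetical helper, and the sector number would come from the LBAsect field in the kernel log.

```shell
# Succeed only if one full 512-byte sector can be read at the given position.
# $1: device or image file, $2: absolute sector number (LBA).
sector_readable() {
    # Count the bytes dd manages to deliver; the arithmetic wrapper
    # strips any padding that wc adds on some systems.
    bytes=$(( $(dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | wc -c) ))
    [ "$bytes" -eq 512 ]
}
```

From the rescue system, as root, something like `sector_readable /dev/hdb 18041498 && echo readable` would then show whether the sector from the log still fails.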
The 03.08.05 at 13:02, Josh Trutwin wrote:
> It is reiser, and there is a reiserfsck program that I've been manning that I can use to check for bad blocks ala fsck. Does booting in safe mode mount all partitions read-only?

Can reiserfsck search for bad blocks and mark them unusable or something? I haven't seen that option.

-- 
Cheers,
Carlos Robinson
On Thu, 7 Aug 2003 01:22:43 +0200 (CEST), "Carlos E. R." wrote:
> The 03.08.05 at 13:02, Josh Trutwin wrote:
> > It is reiser, and there is a reiserfsck program that I've been manning that I can use to check for bad blocks ala fsck. Does booting in safe mode mount all partitions read-only?
>
> Can reiserfsck search for bad blocks and mark them unusable or something? I haven't seen that option.

I think that feature is more low-level than the filesystem. "man hdparm" describes the -D option to enable this.

Most drive manufacturers have a bootable disk to set various drive features, and marking bad blocks is usually one of the features.

##########################################################

P.S. I have a paranoid rant about these "bad blocks" in suse-ot, if you are interested. I think something "undocumented" is going on.

-- 
I'm not really a human, but I play one on earth.
The 03.08.07 at 09:28, zentara wrote:
> > Can reiserfsck search for bad blocks and mark them unusable or something? I haven't seen that option.
>
> I think that feature is more low-level than the filesystem. "man hdparm" describes the -D option to enable this.

Ah, I hadn't noticed before that it had to be enabled:

    -D  Enable/disable the on-drive defect management feature, whereby the drive firmware tries to automatically manage defective sectors by relocating them to "spare" sectors reserved by the factory for such.

But... the ability of hardware to do that is relatively new. "Old" filesystems, like MS-DOS FAT, can easily mark sectors as unusable. Linux ext2 can do it, either when creating the filesystem or when testing it:

    -c  This option causes e2fsck to run the badblocks(8) program to find any blocks which are bad on the filesystem, and then marks them as bad by adding them to the bad block inode. If this option is specified twice, then the bad block scan will be done using a non-destructive read-write test.

> Most drive manufacturers have a bootable disk to set various drive features, and marking bad blocks is usually one of the features.

I haven't seen it on the one by Seagate... I'll have to look harder.

> P.S. I have a paranoid rant about these "bad blocks" in suse-ot, if you are interested. I think something "undocumented" is going on.

Ah... I'm resisting the urge to subscribe O:-) Perhaps I'll have a look at the archive. :-)

-- 
Cheers,
Carlos Robinson
On Tue, 5 Aug 2003 11:40:22 -0500, Josh Trutwin wrote:
> Hi list,
>
> I recently put SuSE 8.1 on a used PIII 1 GHz box and have just had a crash. I noticed the following in my /var/log/messages file:
>
> Aug 5 10:58:00 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }
> Aug 5 10:58:00 langly kernel: hdb: dma_intr: error=0x40 { UncorrectableError }, LBAsect=18041498, sector=10490808
> Aug 5 10:58:00 langly kernel: end_request: I/O error, dev 03:46 (hdb), sector 10490808
> Aug 5 10:58:06 langly kernel: hdb: dma_intr: status=0x51 { DriveReady SeekComplete Error }

I just ran into this; it was caused by Windows (probably running defrag, or maybe some virus). I had to completely zero out the hard drive. It will work again; at least it did for me. You need to write zeroes over the entire hard drive, including the boot sector.

When will I learn not to run Windows on the same disk as Linux?

-- 
I'm not really a human, but I play one on earth.
participants (4)
- Carlos E. R.
- Charles Philip Chan
- Josh Trutwin
- zentara