[opensuse-kernel] Harddisk dying?
I see on my system (running 3.1 rc6) quite a lot of these in dmesg (complete dmesg attached):
[27907.047446] ata3.00: exception Emask 0x40 SAct 0x1 SErr 0x880800 action 0x6 frozen
[27907.047453] ata3: SError: { HostInt 10B8B LinkSeq }
[27907.047457] ata3.00: failed command: READ FPDMA QUEUED
[27907.047463] ata3.00: cmd 60/08:00:f0:1d:53/00:00:0b:00:00/40 tag 0 ncq 4096 in
[27907.047465] res 40/00:00:40:fb:8c/00:00:14:00:00/40 Emask 0x44 (timeout)
[27907.047468] ata3.00: status: { DRDY }
[27907.047474] ata3: hard resetting link
[27907.351451] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[27907.353803] ata3.00: configured for UDMA/33
[27907.364380] ata3: EH complete
Is my harddisk dying? I have two harddisks - which one of these is it (sda or sdb)? This part of dmesg did not help me to identify which disk is ata3 and which is ata4:
[ 1.757441] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.757470] ata4: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1.758611] ata3.00: ATA-8: ST3750528AS, CC38, max UDMA/133
[ 1.758616] ata3.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 1.758651] ata4.00: ATA-8: ST3750528AS, CC38, max UDMA/133
[ 1.758655] ata4.00: 1465149168 sectors, multi 16: LBA48 NCQ (depth 31/32)
[ 1.759050] ata5.00: ATAPI: ATAPI iHAS124 Y, BL0V, max UDMA/100
[ 1.760038] ata3.00: configured for UDMA/133
[ 1.760065] ata4.00: configured for UDMA/133
[ 1.760224] scsi 2:0:0:0: Direct-Access ATA ST3750528AS CC38 PQ: 0 ANSI: 5
[ 1.760382] sd 2:0:0:0: [sda] 1465149168 512-byte logical blocks: (750 GB/698 GiB)
[ 1.760415] sd 2:0:0:0: [sda] Write Protect is off
[ 1.760417] sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 1.760420] scsi 3:0:0:0: Direct-Access ATA ST3750528AS CC38 PQ: 0 ANSI: 5
[ 1.760436] sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.760505] sd 3:0:0:0: [sdb] 1465149168 512-byte logical blocks: (750 GB/698 GiB)
[ 1.760533] sd 3:0:0:0: [sdb] Write Protect is off
[ 1.760535] sd 3:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.760544] sd 3:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Andreas
--
Andreas Jaeger aj@{suse.com,opensuse.org} Twitter/Identica: jaegerandi
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg, Germany
GF: Jeff Hawn, Jennifer Guild, Felix Imendörffer, HRB 16746 (AG Nürnberg)
GPG fingerprint = 93A3 365E CE47 B889 DF7F FED1 389A 563C C272 A126
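On the question of which disk is ata3: the boot log above only ties sda to SCSI host 2 and sdb to SCSI host 3, not to an ata port. One way to check, sketched below, is to resolve the disk's sysfs link and look for an ataN path component. This assumes a kernel whose sysfs device path includes that component; the function names and the example path are made up for illustration and are not taken from the thread.

```python
import os
import re

# Matches an "ataN" directory component inside a resolved sysfs device path.
_ATA_COMPONENT = re.compile(r"/(ata\d+)/")


def ata_port_from_sysfs_path(path):
    """Return the 'ataN' component of a resolved sysfs path, or None if absent."""
    match = _ATA_COMPONENT.search(path)
    return match.group(1) if match else None


def ata_port_for_disk(disk):
    """Resolve /sys/block/<disk> and report the ata port it sits behind.

    Returns None when the path has no ataN component (for example on a
    non-libata device, or on a system without this sysfs layout).
    """
    resolved = os.path.realpath(os.path.join("/sys/block", disk))
    return ata_port_from_sysfs_path(resolved)
```

Calling ata_port_for_disk("sda") and ata_port_for_disk("sdb") on the affected machine should name the port for each disk. If the ataN component is missing, a common rule of thumb is that the ata number is the SCSI host number plus one, which would make ata3 the disk on host 2 (sda here), but that is an assumption and not something the log states.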
On Tue, Sep 27, 2011 at 3:27 PM, Andreas Jaeger <aj@suse.com> wrote:
I see on my system (running 3.1 rc6) quite a lot of these in dmesg (complete dmesg attached):
[27907.047446] ata3.00: exception Emask 0x40 SAct 0x1 SErr 0x880800 action 0x6 frozen
[27907.047453] ata3: SError: { HostInt 10B8B LinkSeq }
I don't see any media errors, so your platter and disk head seem fine.
The first thing I'd do is replace your SATA cables. They don't cost much and they do go bad.
FYI: if you swap out drives often, you should know that a SATA cable is rated for 20 insertions, if I recall right. Yes, 20, not 20,000. eSATA is rated for far more.
In our lab we swap out drives all the time, so we routinely see SATA cable failures. In general, you just see weird errors like the above when they go bad.
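The "no media errors" reading can be cross-checked by decoding the SErr value from the log by hand. The sketch below maps SError register bits to the short names the kernel prints. The bit table is written from memory of the libata naming and should be treated as an assumption, although for 0x880800 it reproduces the same three flags the kernel itself printed (HostInt 10B8B LinkSeq).

```python
# Bit position -> short flag name, as libata prints them in "SError: { ... }".
# Assumption: table reproduced from memory, not copied from the thread.
SERR_BITS = {
    0: "RecovData", 1: "RecovComm", 8: "UnrecovData", 9: "Persist",
    10: "Proto", 11: "HostInt", 16: "PHYRdyChg", 17: "PHYInt",
    18: "CommWake", 19: "10B8B", 20: "Dispar", 21: "BadCRC",
    22: "Handshk", 23: "LinkSeq", 24: "TrStaTrns", 25: "UnrecFIS",
    26: "DevExch",
}


def decode_serr(value):
    """Return the names of all known SError flags set in value, lowest bit first."""
    return [name for bit, name in sorted(SERR_BITS.items()) if value & (1 << bit)]


print(decode_serr(0x880800))
```

All three flags describe trouble on the host or link side (an internal host error, a 10b-to-8b decoding error, a link sequence error) and none of them is the unrecovered-data flag, which fits the cable theory better than a failing platter.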
[27907.047457] ata3.00: failed command: READ FPDMA QUEUED
[27907.047463] ata3.00: cmd 60/08:00:f0:1d:53/00:00:0b:00:00/40 tag 0 ncq 4096 in
[27907.047465] res 40/00:00:40:fb:8c/00:00:14:00:00/40 Emask 0x44 (timeout)
From here on down, everything is normal / good.
[27907.047468] ata3.00: status: { DRDY }
[27907.047474] ata3: hard resetting link
[27907.351451] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
[27907.353803] ata3.00: configured for UDMA/33
[27907.364380] ata3: EH complete
Is my harddisk dying? I have two harddisks - which one of these is it (sda or sdb)? This part of dmesg did not help me to identify which disk is ata3 and which is ata4:
Just replace both cables and I bet you're fine. I always replace ours in pairs.
Greg
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse-kernel+help@opensuse.org
On 09/27/2011 10:10 PM, Greg Freemyer wrote:
[...]
Just replace both cables and I bet you're fine. I always replace ours in pairs.
Greg
After that, I would recommend running a full SMART self-test on both disks:
    smartctl -t long /dev/sda
    smartctl -t long /dev/sdb
Then go to sleep, and check the result the next morning:
    smartctl -a /dev/sda
Look for anything suspicious, in particular a Current_Pending_Sector value that is not 0.
--
Bruno Friedmann
Ioda-Net Sàrl www.ioda-net.ch
openSUSE Member & Ambassador
GPG KEY : D5C9B751C4653227
irc: tigerfoot
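The morning-after check can be scripted. Below is a minimal sketch that pulls the raw value of Current_Pending_Sector out of the attribute table that smartctl prints. It assumes the usual ten-column table layout, and the SAMPLE text is invented for illustration; it is not output from Andreas's disks.

```python
# Illustrative attribute table in the usual smartctl layout (made-up values).
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""


def smart_raw_values(smartctl_output):
    """Map attribute name -> raw value string for every attribute row found."""
    values = {}
    for line in smartctl_output.splitlines():
        fields = line.split()
        # Attribute rows start with a numeric ID and have at least 10 columns.
        if len(fields) >= 10 and fields[0].isdigit():
            values[fields[1]] = fields[9]
    return values


def pending_sectors(smartctl_output):
    """Return Current_Pending_Sector as an int, or None if it is not reported."""
    raw = smart_raw_values(smartctl_output).get("Current_Pending_Sector")
    return int(raw) if raw is not None and raw.isdigit() else None


print(pending_sectors(SAMPLE))
```

On a real system the text would come from capturing the output of smartctl -a /dev/sda, for example through subprocess.run, which needs root privileges. A result other than 0 is the warning sign described above.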
On Tuesday, September 27, 2011 10:10:35 PM Greg Freemyer wrote:
On Tue, Sep 27, 2011 at 3:27 PM, Andreas Jaeger <aj@suse.com> wrote: [...] Just replace both cables and I bet you're fine. I always replace ours in pairs.
Will try tomorrow (I'm accessing the machine remotely right now), thanks for the tip.
Now I also see:
[29589.557267] sd 2:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[29589.557273] sd 2:0:0:0: [sda] Sense Key : Aborted Command [current] [descriptor]
[29589.557278] Descriptor sense data with sense descriptors (in hex):
[29589.557280] 72 0b 00 00 00 00 00 0c 00 0a 80 00 00 00 00 00
[29589.557290] 0d a2 5b 10
[29589.557294] sd 2:0:0:0: [sda] Add. Sense: No additional sense information
[29589.557298] sd 2:0:0:0: [sda] CDB: Write(10): 2a 00 0d a2 67 10 00 04 00 00
[29589.557307] end_request: I/O error, dev sda, sector 228747024
[29589.557312] Buffer I/O error on device sda6, logical block 4473570
[29589.557317] Buffer I/O error on device sda6, logical block 4473571
[29589.557320] Buffer I/O error on device sda6, logical block 4473572
Andreas
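The CDB line and the "sector 228747024" line in this log describe the same request, which can be verified by decoding the CDB. The sketch below assumes the standard 10-byte WRITE(10) layout: opcode 0x2a, a 4-byte big-endian LBA in bytes 2 to 5, and a 2-byte big-endian transfer length in bytes 7 and 8. The function name is made up for illustration.

```python
def decode_write10(cdb_hex):
    """Decode a WRITE(10) CDB given as a hex string; return (lba, block_count)."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x2A:
        raise ValueError("not a 10-byte WRITE(10) CDB")
    lba = int.from_bytes(cdb[2:6], "big")      # starting logical block address
    blocks = int.from_bytes(cdb[7:9], "big")   # number of blocks to transfer
    return lba, blocks


lba, blocks = decode_write10("2a 00 0d a2 67 10 00 04 00 00")
print(lba, blocks)
```

The decoded LBA is 0x0da26710, which is 228747024 in decimal and matches the sector in the end_request line; the transfer length 0x0400 is 1024 blocks. So this is a single large write to sda that was aborted, and the three buffer errors on sda6 are fallout from that one failed command.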
On 09/27/2011 10:56 PM, Andreas Jaeger wrote:
[...]
[29589.557307] end_request: I/O error, dev sda, sector 228747024
[...]
OK, that kind of thing can happen, but it should not recur often. Apart from that, SATA drives under heavy load (IMAP servers etc.) should not be kept in service for more than 3 years.
A smartctl -t long test could settle the question. After that, do a cold reboot with a full power off / on.
--
Bruno Friedmann
On 27/09/11 16:27, Andreas Jaeger wrote:
Is my harddisk dying?
Though I believe your drive is likely dying, the drive's firmware is outdated and may have bugs.
[ 1.760224] scsi 2:0:0:0: Direct-Access ATA ST3750528AS CC38 PQ: 0 ANSI: 5
Current version is CC49: http://seagate.custkb.com/seagate/crm/selfservice/search.jsp?DocId=213891&NewLang=en
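The firmware check can be done for every disk at once by reading the model and firmware revision out of the libata identify lines in dmesg. This is a small sketch; the regular expression is fitted to the line format seen in this thread, and the function name is made up for illustration.

```python
import re

# Matches identify lines such as:
#   ata3.00: ATA-8: ST3750528AS, CC38, max UDMA/133
_IDENTIFY = re.compile(r"(ata\d+\.\d+): ATA-\d+: ([^,]+), ([^,]+), max")


def parse_identify(line):
    """Return (device, model, firmware) from an ATA identify line, else None."""
    match = _IDENTIFY.search(line)
    if not match:
        return None
    return match.group(1), match.group(2).strip(), match.group(3).strip()


line = "[ 1.758611] ata3.00: ATA-8: ST3750528AS, CC38, max UDMA/133"
device, model, firmware = parse_identify(line)
print(device, model, firmware, "outdated" if firmware != "CC49" else "current")
```

Both ata3.00 and ata4.00 report CC38 in the boot log, so if CC49 is indeed the current revision for this model, the update would apply to both drives.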
participants (4)
- Andreas Jaeger
- Bruno Friedmann
- Cristian Rodríguez
- Greg Freemyer