Mailinglist Archive: opensuse-kernel (78 mails)
Re: [opensuse-kernel] ATA errors
Hi Felix,

On Sat, 2012-04-14 at 13:55 -0400, Jeff Mahoney wrote:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 04/14/2012 02:00 AM, Felix Miata wrote:
http://fm.no-ip.com/Tmp/Linux/messages-gx280-esata.txt (32bit 12.1
kernel-desktop-3.1.9)

This log shows a WRITE DMA EXT timeout:

Apr 13 13:54:50 gx280 smartd[1613]: Device: /dev/sdc [SAT], SMART Usage
Attribute: 195 Hardware_ECC_Recovered changed from 25 to 30
Apr 13 14:16:48 gx280 kernel: [ 4975.072049] ata5.00: exception Emask 0x0 SAct
0x0 SErr 0x0 action 0x6 frozen
Apr 13 14:16:48 gx280 kernel: [ 4975.072056] ata5.00: failed command: WRITE DMA
EXT
Apr 13 14:16:48 gx280 kernel: [ 4975.072065] ata5.00: cmd
35/00:08:48:68:a8/00:00:ae:00:00/e0 tag 0 dma 4096 out
Apr 13 14:16:48 gx280 kernel: [ 4975.072067] res
40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr 13 14:16:48 gx280 kernel: [ 4975.072071] ata5.00: status: { DRDY }
Apr 13 14:16:48 gx280 kernel: [ 4975.072082] ata5: hard resetting link
Apr 13 14:16:49 gx280 kernel: [ 4975.377042] ata5: SATA link up 1.5 Gbps
(SStatus 113 SControl 310)
Apr 13 14:16:49 gx280 kernel: [ 4975.405487] ata5.00: configured for UDMA/100
Apr 13 14:16:49 gx280 kernel: [ 4975.405502] ata5: EH complete
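
For reference, the "cmd" field in the excerpt above can be decoded to see which sectors the failed command addressed. This is only a minimal sketch: it assumes the field order libata's error handler prints (CMD/FEATURE:NSECT:LBAL:LBAM:LBAH/HOB_FEATURE:HOB_NSECT:HOB_LBAL:HOB_LBAM:HOB_LBAH/DEVICE), and the function name and the small opcode table are illustrative, not part of any existing tool.

```python
# Partial opcode table: only the four DMA read/write commands relevant here.
# The boolean says whether the command uses 48-bit (EXT) addressing.
OPCODES = {
    0xC8: ("READ DMA", False),
    0xCA: ("WRITE DMA", False),
    0x25: ("READ DMA EXT", True),
    0x35: ("WRITE DMA EXT", True),
}


def decode_taskfile(cmd):
    """Decode a libata 'cmd' string into command name, LBA and sector count.

    Hypothetical helper. Opcodes missing from OPCODES are treated as
    28-bit commands, which may be wrong for them.
    """
    command, low, high, device = cmd.strip().split("/")
    _feature, nsect, lbal, lbam, lbah = (int(b, 16) for b in low.split(":"))
    _hob_feature, hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(b, 16) for b in high.split(":")
    )
    opcode = int(command, 16)
    name, is_ext = OPCODES.get(opcode, ("unknown (0x%02x)" % opcode, False))

    lba = (lbah << 16) | (lbam << 8) | lbal
    if is_ext:
        # LBA48: the "high order bytes" supply address bits 24..47.
        lba |= (hob_lbah << 40) | (hob_lbam << 32) | (hob_lbal << 24)
        sectors = ((hob_nsect << 8) | nsect) or 65536  # 0 means 65536
    else:
        # LBA28: address bits 24..27 live in the low nibble of DEVICE.
        lba |= (int(device, 16) & 0x0F) << 24
        sectors = nsect or 256  # 0 means 256
    return {"command": name, "lba": lba, "sectors": sectors}


if __name__ == "__main__":
    print(decode_taskfile("35/00:08:48:68:a8/00:00:ae:00:00/e0"))
```

For the cmd line above this gives LBA 2930272328 and 8 sectors, which matches the 4096 bytes in the log if the drive uses 512-byte sectors.
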

The machine hangs until the command times out.

http://fm.no-ip.com/Tmp/Linux/messages-gx620-esata.txt (32bit 12.1
kernel-desktop-3.1.9)

This one shows a READ DMA timeout:

Apr 13 15:18:38 vizio kernel: [ 265.732111] EXT4-fs (sdb2): mounted
filesystem without journal. Opts: (null)
Apr 13 15:29:41 vizio kernel: [ 929.056055] ata5.00: exception Emask 0x0 SAct
0x0 SErr 0x0 action 0x6 frozen
Apr 13 15:29:41 vizio kernel: [ 929.056063] ata5.00: failed command: READ DMA
Apr 13 15:29:41 vizio kernel: [ 929.056071] ata5.00: cmd
c8/00:08:18:38:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Apr 13 15:29:41 vizio kernel: [ 929.056073] res
40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr 13 15:29:41 vizio kernel: [ 929.056077] ata5.00: status: { DRDY }
Apr 13 15:29:41 vizio kernel: [ 929.056089] ata5: hard resetting link
Apr 13 15:29:42 vizio kernel: [ 929.361042] ata5: SATA link up 1.5 Gbps
(SStatus 113 SControl 310)
Apr 13 15:29:42 vizio kernel: [ 929.405441] ata5.00: configured for UDMA/100
Apr 13 15:29:42 vizio kernel: [ 929.405456] ata5: EH complete
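
To see how often each port is affected, the failures in a full log can be counted per ATA device. A rough sketch, assuming the real /var/log/messages has each kernel message on a single line (the excerpts in this mail are wrapped); the function name and regex are illustrative only.

```python
import re
import sys
from collections import Counter

# Matches e.g. "ata5.00: failed command: WRITE DMA EXT"
FAILED = re.compile(r"(ata\d+\.\d+): failed command: (.+?)\s*$")


def count_failed_commands(lines):
    """Count 'failed command' messages per (ATA device, command) pair."""
    counts = Counter()
    for line in lines:
        match = FAILED.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Usage: python count_failed.py /var/log/messages
    with open(sys.argv[1], errors="replace") as log:
        for (device, command), n in sorted(count_failed_commands(log).items()):
            print(device, command, n)
```

If every failure is on the same ataN.MM, that at least narrows the problem to one port and whatever is attached to it.
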


Is there anything in these /var/log/messages excerpts from two
different hosts that can identify which (external) sata device is
connected with the reported errors? I don't see any connection
between that information and the content of
/dev/disk/[by-path,by-id]. Is there a way to tell if a kernel bug
or a device is the problem? These errors show up in conjunction
with various degrees of I/O delay, sometimes as bad as several
minutes of inability to get bash to respond to keyboard input. The
most recent time, some commands simply segfaulted. Others would
produce bash: input/output error.
Ctrl-Alt-Del failed to reboot, as did init 6 (input/output error),
forcing me to use power switch and wait a seeming eternity for fsck
on 1TB+ partitions to complete.

You'll find that info earlier in the log. e.g. on my system:
[ 1.290219] ata1.00: ATA-8: OCZ-VERTEX2, 1.35, max UDMA/133

Alternatively, lsscsi -v will show:
[0:0:0:0] disk ATA OCZ-VERTEX2 1.35 /dev/sda
dir: /sys/bus/scsi/devices/0:0:0:0
[/sys/devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0]

... where you can see the 'ata1' in the sysfs path.
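
If lsscsi is not installed, the same mapping can be read from sysfs directly. A minimal sketch, assuming /sys/block/<disk> resolves to a path containing an ataN component like the one shown above; the function names are made up for illustration.

```python
import os
import re

ATA_PORT = re.compile(r"/ata(\d+)/")


def ata_port_from_sysfs_path(path):
    """Return the ATA port number embedded in a sysfs path, or None."""
    match = ATA_PORT.search(path)
    return int(match.group(1)) if match else None


def ata_ports_by_disk(sys_block="/sys/block"):
    """Map block device names (sda, sdb, ...) to their ATA port numbers.

    Disks whose resolved sysfs path has no ataN component (USB disks,
    for example) are left out.
    """
    ports = {}
    for name in sorted(os.listdir(sys_block)):
        real = os.path.realpath(os.path.join(sys_block, name))
        port = ata_port_from_sysfs_path(real)
        if port is not None:
            ports[name] = port
    return ports


if __name__ == "__main__":
    for disk, port in ata_ports_by_disk().items():
        print("ata%d -> /dev/%s" % (port, disk))
```

A disk reported against port 5 here would be the one behind the ata5.00 messages.
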


Yes, please use lsscsi to check whether ata5.00 (5 is the ATA port ID) is
your external eSATA box.

The problem with that type of error is that it's tough to isolate what
the problem is without detailed hardware knowledge (the type of
knowledge I don't personally have).

Joey's team has more experience with ATA issues. Perhaps they can assist.

- -Jeff

At the moment I have no idea why the commands time out when the host tries
to write to or read from the device through DMA.

Have you tried this eSATA box with another OS?

Also, please file a bug at https://bugzilla.novell.com and attach your
complete /var/log/messages, /var/log/boot.msg and the output of lsmod.

I will investigate it.


Thanks a lot!
Joey Lee


I've run the Seagate thorough diagnostics on the drives, 100% error
free.

The suspect device is
http://www.newegg.com/Product/Product.aspx?Item=N82E16817173042 but
I want to provide Rosewill tech support hard evidence of the
incompetence of the device if I can be sure it is exclusively
responsible for the errors. I bought 5 of them and never had
trouble. Only with #6, the only version 2 of the model I have, has
there been a problem. The #6 I have now is actually the 3rd
replacement of the original #6 purchased last September.


- --
Jeff Mahoney
SUSE Labs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.18 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBAgAGBQJPiboTAAoJEB57S2MheeWyz7UP/3dU1aCuJu/hs+d0W9dmCojN
/1455tvjXYCKIX5GurvJXgt6pHUlheWgpbWVHgORUfB6BOWOnkWyBwY7vXsdF+TR
QNxI0Q/IcvcicrFL2kQe66+GOGZoatttb7KhAb0XXiVrKfyzAiHgl7LCGem7clXQ
yWJ2Nm0os9yZ+iYMelVA0SIDRB/1fSMvAy+L8UdwpIv6c4mvwqj+Vo/oi0GB8xvm
LQZFQ5i9sAj00V/hBQxb12WnfNDIZvrW+FXR4uufSYtzaVzRoq7Ui6cjKW4mREnI
o6p7td+wbEaL397mO9zHKGmkURWRW3qwfoLtq4DJyhcmvzsWhn2KsqNeXrHG5KAD
MpaQLE+xuKmE/egxCE9TSm+Eks+pNhocugKd3uJSI4P1164oHFBlKycQAOpQvWcH
31v6LRCX08vfRHyL/4diAqw249fPWM1ONR+mNnbzuZffmDz5m0GsVdK2oY1b93Hd
Qa4QMIFvYDfGamlH8frlc/9sIVXBSyBQDHrLeRbuGz2PAxmYJQPAbP6+eV1Jv82/
9z0UMCLlK0gr7PD4p3v+kVEXpwTz8sPEPaA4zejrfP5H1+egVDA+v6SknLpmEdVy
qfU4D0ru/9AaJckpA7KtgyXexkDcb0znHWwsy71LxSQPeSuV9arEj7VDpcZw2mJ+
QWOfKrzGhUiLxjU3oauP
=sak/
-----END PGP SIGNATURE-----


