Hi Felix,

On Sat, 2012-04-14 at 13:55 -0400, Jeff Mahoney wrote:
On 04/14/2012 02:00 AM, Felix Miata wrote:
http://fm.no-ip.com/Tmp/Linux/messages-gx280-esata.txt (32bit 12.1 kernel-desktop-3.1.9)
There is a WRITE DMA timeout:

Apr 13 13:54:50 gx280 smartd[1613]: Device: /dev/sdc [SAT], SMART Usage Attribute: 195 Hardware_ECC_Recovered changed from 25 to 30
Apr 13 14:16:48 gx280 kernel: [ 4975.072049] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 13 14:16:48 gx280 kernel: [ 4975.072056] ata5.00: failed command: WRITE DMA EXT
Apr 13 14:16:48 gx280 kernel: [ 4975.072065] ata5.00: cmd 35/00:08:48:68:a8/00:00:ae:00:00/e0 tag 0 dma 4096 out
Apr 13 14:16:48 gx280 kernel: [ 4975.072067] res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr 13 14:16:48 gx280 kernel: [ 4975.072071] ata5.00: status: { DRDY }
Apr 13 14:16:48 gx280 kernel: [ 4975.072082] ata5: hard resetting link
Apr 13 14:16:49 gx280 kernel: [ 4975.377042] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr 13 14:16:49 gx280 kernel: [ 4975.405487] ata5.00: configured for UDMA/100
Apr 13 14:16:49 gx280 kernel: [ 4975.405502] ata5: EH complete

This leaves the machine stalled until the command times out.
http://fm.no-ip.com/Tmp/Linux/messages-gx620-esata.txt (32bit 12.1 kernel-desktop-3.1.9)
Here there is a READ DMA timeout:

Apr 13 15:18:38 vizio kernel: [ 265.732111] EXT4-fs (sdb2): mounted filesystem without journal. Opts: (null)
Apr 13 15:29:41 vizio kernel: [ 929.056055] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
Apr 13 15:29:41 vizio kernel: [ 929.056063] ata5.00: failed command: READ DMA
Apr 13 15:29:41 vizio kernel: [ 929.056071] ata5.00: cmd c8/00:08:18:38:00/00:00:00:00:00/e0 tag 0 dma 4096 in
Apr 13 15:29:41 vizio kernel: [ 929.056073] res 40/00:01:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)
Apr 13 15:29:41 vizio kernel: [ 929.056077] ata5.00: status: { DRDY }
Apr 13 15:29:41 vizio kernel: [ 929.056089] ata5: hard resetting link
Apr 13 15:29:42 vizio kernel: [ 929.361042] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Apr 13 15:29:42 vizio kernel: [ 929.405441] ata5.00: configured for UDMA/100
Apr 13 15:29:42 vizio kernel: [ 929.405456] ata5: EH complete
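Both excerpts report failures on a port the kernel calls ata5. As a rough aid for going through a longer /var/log/messages, here is a minimal Python sketch that counts "failed command" lines per ATA port. The function name summarize_failed_commands and the regular expression are my own illustration, not part of any tool, and the pattern only assumes the line layout seen in the two excerpts.

```python
import re
from collections import Counter

# Matches lines such as "ata5.00: failed command: WRITE DMA EXT".
# The layout is assumed from the log excerpts in this thread.
FAILED_CMD = re.compile(r"\b(ata\d+)\.\d+: failed command: (.+?)\s*$")


def summarize_failed_commands(lines):
    """Count failed ATA commands, keyed by (port, command name)."""
    counts = Counter()
    for line in lines:
        match = FAILED_CMD.search(line)
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


# Sample lines copied from the two log excerpts.
sample = [
    "Apr 13 14:16:48 gx280 kernel: [ 4975.072056] ata5.00: failed command: WRITE DMA EXT",
    "Apr 13 14:16:48 gx280 kernel: [ 4975.072082] ata5: hard resetting link",
    "Apr 13 15:29:41 vizio kernel: [ 929.056063] ata5.00: failed command: READ DMA",
    "Apr 13 15:29:42 vizio kernel: [ 929.405456] ata5: EH complete",
]

if __name__ == "__main__":
    for (port, command), count in sorted(summarize_failed_commands(sample).items()):
        print(port, command, count)
```

To run it over a real log, pass an open file object instead of the sample list. A count concentrated on a single port would at least show which port to map back to a device.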
Is there anything in these /var/log/messages excerpts from two different hosts that can identify which (external) SATA device is connected with the reported errors? I don't see any connection between that information and the content of /dev/disk/[by-path,by-id]. Is there a way to tell whether a kernel bug or a device is the problem?

These errors show up in conjunction with various degrees of I/O delay, sometimes as bad as several minutes of inability to get bash to respond to keyboard input. The most recent time, some commands would simply segfault. Others would produce "bash: input/output error". Ctrl-Alt-Del failed to reboot, as did init 6 (input/output error), forcing me to use the power switch and wait a seeming eternity for fsck on 1TB+ partitions to complete.
You'll find that info earlier in the log, e.g. on my system:

[ 1.290219] ata1.00: ATA-8: OCZ-VERTEX2, 1.35, max UDMA/133
Alternatively, lsscsi -v will show:

[0:0:0:0] disk ATA OCZ-VERTEX2 1.35 /dev/sda
  dir: /sys/bus/scsi/devices/0:0:0:0 [/sys/devices/pci0000:00/0000:00:11.0/ata1/host0/target0:0:0/0:0:0:0]
... where you can see the 'ata1' in the sysfs path.
Yes, please use lsscsi to check whether ata5.00 (5 is the ATA port ID) is your external eSATA box.
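The two suggestions above (look for the identification line earlier in the log, or read the ataN component out of the sysfs path) can be sketched in Python. This is only an illustration: the helper names model_for_port, ata_port_from_sysfs_path and block_devices_on_port are hypothetical, the first regular expression assumes the "ATA-8: model, firmware, max ..." layout shown in Jeff's example, and the last helper assumes that the resolved /sys/block/<name> path contains an ataN component as in the lsscsi output.

```python
import os
import re

# Assumed layout: "ata1.00: ATA-8: OCZ-VERTEX2, 1.35, max UDMA/133"
IDENTIFY = re.compile(r"\b(ata\d+)\.\d+: ATA-\d+: (.+?), (.+?), max ")
# An "ataN" directory component inside a sysfs device path.
ATA_COMPONENT = re.compile(r"/(ata\d+)/")


def model_for_port(lines, port):
    """Return (model, firmware) from the first identify line for `port`."""
    for line in lines:
        match = IDENTIFY.search(line)
        if match and match.group(1) == port:
            return match.group(2), match.group(3)
    return None


def ata_port_from_sysfs_path(path):
    """Extract 'ataN' from a sysfs device path, or None if absent."""
    match = ATA_COMPONENT.search(path)
    return match.group(1) if match else None


def block_devices_on_port(port, sys_block="/sys/block"):
    """List /dev names whose resolved sysfs path goes through `port`."""
    found = []
    if not os.path.isdir(sys_block):
        return found
    for name in sorted(os.listdir(sys_block)):
        real = os.path.realpath(os.path.join(sys_block, name))
        if ata_port_from_sysfs_path(real) == port:
            found.append("/dev/" + name)
    return found


if __name__ == "__main__":
    # On the affected host this would list the disk(s) behind ata5, if any.
    print(block_devices_on_port("ata5"))
```

If block_devices_on_port("ata5") names the disk in the eSATA enclosure, that ties the ata5.00 errors to the external box without relying on /dev/disk/by-path.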
The problem with that type of error is that it's tough to isolate what the problem is without detailed hardware knowledge (the type of knowledge I don't personally have).
Joey's team has more experience with ATA issues. Perhaps they can assist.
-Jeff
Currently, I have no idea why the command times out when the host tries to write to or read from the device through DMA. Did you try this eSATA box with another OS? Also, please file a bug at https://bugzilla.novell.com and attach your whole /var/log/messages, /var/log/boot.msg and the output of lsmod. I will investigate it.

Thanks a lot!
Joey Lee
I've run the Seagate thorough diagnostics on the drives, 100% error free.
The suspect device is http://www.newegg.com/Product/Product.aspx?Item=N82E16817173042 but I want to provide Rosewill tech support hard evidence of the incompetence of the device if I can be sure it is exclusively responsible for the errors. I bought 5 of them and never had trouble. Only with #6, the only version 2 of the model I have, has there been a problem. The #6 I have now is actually the 3rd replacement of the original #6 purchased last September.
--
Jeff Mahoney
SUSE Labs