[opensuse] SATA weirdness
I've been having fun and games at home. Short story first.

I have a fairly old system and one of the SATA drives is failing, so I bought a replacement. The system didn't see it at all, so I rushed out and bought another drive (Xmas!) but it wasn't seen either. Google told me it was a problem with the SATA chip on the motherboard, so I borrowed a SATA adapter. It kind of worked but gave lots of bus errors. I tried a couple of other adapters with similar results. I tested the drives and they're perfect. I read about power issues with SATA, so I disconnected everything I could, but that made no difference. So I started looking for a new system.

Then comes the weird part. I've been running openSUSE 11.2 and have also tested Ubuntu 10.04 with similar results. Last night I tried Knoppix 6.4.3 and apparently everything worked perfectly. I still need to do more testing, but maybe my old system can live a while longer. I'd be interested in any views people have about the prognosis.

OK, now here are the details:

The mobo is an MSI K8M Neo-V, which has two SATA 1.5 Gbps ports controlled by a VIA VT6420 chip (it has a broken design that can't see more recent drives). The failing drive is a Seagate 1.5 Gbps. The new drives are both Samsung 3 Gbps SATA drives: a 1 TB HD103SJ and a 320 GB. The smaller drive has a jumper to force 1.5 Gbps speed, while the larger one uses a software utility. I borrowed a PCI adapter based on the Sil 3512 and I've bought one based on the VIA VT6421A. Like most PCI SATA adapters, they're limited to 1.5 Gbps.

The output from lspci (with the Sil controller) looked like this:

00:00.0 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.1 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.7 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890 South]
00:0b.0 Mass storage controller: Silicon Image, Inc. SiI 3512 [SATALink/SATARaid] Serial ATA Controller (rev 01)
00:0c.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: nVidia Corporation NV44A [GeForce 6200] (rev a1)

Typical error messages were like this:

Jan 5 22:53:27 piglet kernel: [ 157.040095] ata5: hard resetting link
Jan 5 22:53:27 piglet kernel: [ 157.390039] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:53:32 piglet kernel: [ 162.390035] ata5: hard resetting link
Jan 5 22:53:33 piglet kernel: [ 162.740037] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:53:33 piglet kernel: [ 162.780287] ata5.00: configured for UDMA/100
Jan 5 22:53:33 piglet kernel: [ 162.780294] ata5.00: device reported invalid CHS sector 0
Jan 5 22:53:33 piglet kernel: [ 162.780302] ata5: EH complete
Jan 5 22:54:03 piglet kernel: [ 193.040089] ata5: hard resetting link
Jan 5 22:54:03 piglet kernel: [ 193.390060] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:54:03 piglet kernel: [ 193.430287] ata5.00: configured for UDMA/100
Jan 5 22:54:03 piglet kernel: [ 193.430295] ata5.00: device reported invalid CHS sector 0
Jan 5 22:54:03 piglet kernel: [ 193.430308] ata5: EH complete
Jan 5 22:54:07 piglet kernel: [ 197.042033] ata5.00: limiting speed to UDMA/66:PIO4
Jan 5 22:54:07 piglet kernel: [ 197.042070] ata5: hard resetting link
Jan 5 22:54:07 piglet kernel: [ 197.390059] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:54:07 piglet kernel: [ 197.430288] ata5.00: configured for UDMA/66
Jan 5 22:54:07 piglet kernel: [ 197.430305] ata5: EH complete
Jan 5 22:54:08 piglet kernel: [ 197.821413] ata5.00: configured for UDMA/66
Jan 5 22:54:08 piglet kernel: [ 197.821437] ata5: EH complete
Jan 5 22:54:38 piglet kernel: [ 228.040099] ata5: hard resetting link
Jan 5 22:54:38 piglet kernel: [ 228.390046] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:54:38 piglet kernel: [ 228.430286] ata5.00: configured for UDMA/66
Jan 5 22:54:38 piglet kernel: [ 228.430309] ata5: EH complete

You can see that it steadily reduces the bus speed.

With the VIA adapter, lspci shows:

00:00.0 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.1 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.2 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.3 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.4 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:00.7 Host bridge: VIA Technologies, Inc. K8M800 Host Bridge
00:01.0 PCI bridge: VIA Technologies, Inc. VT8237 PCI bridge [K8T800/K8T890 South]
00:0b.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID Controller (rev 50)
00:0c.0 FireWire (IEEE 1394): Texas Instruments TSB12LV26 IEEE-1394 Controller (Link)
00:0f.0 RAID bus controller: VIA Technologies, Inc. VIA VT6420 SATA RAID Controller (rev 80)
00:0f.1 IDE interface: VIA Technologies, Inc. VT82C586A/B/VT82C686/A/B/VT823x/A/C PIPC Bus Master IDE (rev 06)
00:10.0 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.1 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.2 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.3 USB Controller: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (rev 81)
00:10.4 USB Controller: VIA Technologies, Inc. USB 2.0 (rev 86)
00:11.0 ISA bridge: VIA Technologies, Inc. VT8237 ISA bridge [KT600/K8T800/K8T890 South]
00:11.5 Multimedia audio controller: VIA Technologies, Inc. VT8233/A/8235/8237 AC97 Audio Controller (rev 60)
00:12.0 Ethernet controller: VIA Technologies, Inc. VT6102 [Rhine-II] (rev 78)
00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
01:00.0 VGA compatible controller: nVidia Corporation NV44A [GeForce 6200] (rev a1)

and error messages looked like this:

Jan 12 20:51:18 piglet kernel: [ 109.441492] ata2.00: exception Emask 0x12 SAct 0x0 SErr 0x1000500 action 0x6
Jan 12 20:51:18 piglet kernel: [ 109.441517] ata2.00: BMDMA stat 0x5
Jan 12 20:51:18 piglet kernel: [ 109.441525] ata2: SError: { UnrecovData Proto TrStaTrns }
Jan 12 20:51:18 piglet kernel: [ 109.441539] ata2.00: cmd c8/00:f0:58:05:57/00:00:00:00:00/e1 tag 0 dma 122880 in
Jan 12 20:51:18 piglet kernel: [ 109.441541] res 51/84:48:00:00:00/84:58:00:00:00/e0 Emask 0x12 (ATA bus error)
Jan 12 20:51:18 piglet kernel: [ 109.441555] ata2.00: status: { DRDY ERR }
Jan 12 20:51:18 piglet kernel: [ 109.441561] ata2.00: error: { ICRC ABRT }
Jan 12 20:51:18 piglet kernel: [ 109.441575] ata2: hard resetting link
Jan 12 20:51:18 piglet kernel: [ 109.746050] ata2: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 12 20:51:18 piglet kernel: [ 109.784313] ata2.00: configured for UDMA/33
Jan 12 20:51:18 piglet kernel: [ 109.784337] ata2: EH complete

The errors didn't seem to cause any data corruption.

Oh, and the kernel versions are:

openSUSE 11.2    2.6.31
Ubuntu 10.04     2.6.32
Knoppix 6.4.3    2.6.36

Has some kernel or driver update improved SATA handling significantly?

Cheers, Dave

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
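The downshift described in the post ("it steadily reduces the bus speed") can be pulled out of a log mechanically. The following is only a rough sketch, assuming log lines shaped like the ones quoted above; the function name summarize and the two regular expressions are illustrative, not part of any existing tool, and the sample is an excerpt of the Sil 3512 log.

```python
import re
from collections import defaultdict

# Excerpt of the ata5 messages quoted in the post (Sil 3512 adapter).
SAMPLE_LOG = """\
Jan 5 22:53:32 piglet kernel: [ 162.390035] ata5: hard resetting link
Jan 5 22:53:33 piglet kernel: [ 162.780287] ata5.00: configured for UDMA/100
Jan 5 22:54:03 piglet kernel: [ 193.040089] ata5: hard resetting link
Jan 5 22:54:03 piglet kernel: [ 193.430287] ata5.00: configured for UDMA/100
Jan 5 22:54:07 piglet kernel: [ 197.042033] ata5.00: limiting speed to UDMA/66:PIO4
Jan 5 22:54:07 piglet kernel: [ 197.042070] ata5: hard resetting link
Jan 5 22:54:07 piglet kernel: [ 197.430288] ata5.00: configured for UDMA/66
Jan 5 22:54:38 piglet kernel: [ 228.040099] ata5: hard resetting link
Jan 5 22:54:38 piglet kernel: [ 228.430286] ata5.00: configured for UDMA/66
"""

# Hypothetical patterns, written to match the message wording seen above.
RESET_RE = re.compile(r"(ata\d+): hard resetting link")
MODE_RE = re.compile(r"(ata\d+)\.\d+: configured for (\S+)")


def summarize(log_text):
    """Count hard resets per port and record each change of transfer mode."""
    resets = defaultdict(int)
    modes = defaultdict(list)
    for line in log_text.splitlines():
        match = RESET_RE.search(line)
        if match:
            resets[match.group(1)] += 1
            continue
        match = MODE_RE.search(line)
        if match:
            port, mode = match.groups()
            # Only keep a mode when it differs from the previous one.
            if not modes[port] or modes[port][-1] != mode:
                modes[port].append(mode)
    return dict(resets), dict(modes)


resets, modes = summarize(SAMPLE_LOG)
for port in sorted(resets):
    print(port, "hard resets:", resets[port],
          "modes:", " -> ".join(modes.get(port, [])))
```

Run against a full /var/log/messages, the same function would show at a glance which ports are being reset and how far each one has been stepped down.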
Hello,

On Wed, 19 Jan 2011, Dave Howorth wrote:
[..]
Jan 5 22:53:33 piglet kernel: [ 162.780294] ata5.00: device reported invalid CHS sector 0
Jan 5 22:53:33 piglet kernel: [ 162.780302] ata5: EH complete
Jan 5 22:54:03 piglet kernel: [ 193.040089] ata5: hard resetting link
Jan 5 22:54:03 piglet kernel: [ 193.390060] ata5: SATA link up 1.5 Gbps (SStatus 113 SControl 310)
Jan 5 22:54:03 piglet kernel: [ 193.430287] ata5.00: configured for UDMA/100
Jan 5 22:54:03 piglet kernel: [ 193.430295] ata5.00: device reported invalid CHS sector 0
Jan 5 22:54:03 piglet kernel: [ 193.430308] ata5: EH complete
Jan 5 22:54:07 piglet kernel: [ 197.042033] ata5.00: limiting speed to UDMA/66:PIO4
I've seen the 'CHS sector 0' message on an external eSATA drive (where the drive is simply slotted in [1]), but the speed stayed up in that case. I do get the speed reduction with "un-DVDs", i.e. discs with defective sectors, where the drive (IDE, by the way) needs minutes to read and "fail" the sector.

With internal drives, I'd recommend checking or replacing the SATA cables (preferably ones with clips).

HTH,
-dnh

[1] Sharkoon QuikDock

--
Nolte: 'Sex on the Internet only after 11 pm!' Follow-up question: 'Which time zone?' Nolte: 'What? Which time zone?'
-- Typical Catholics. They still believe the Earth is a disc. Of course there's only one time zone then.
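For readers wondering why cabling is a usual suspect: the SErr value in the VIA-adapter log from the original post (0x1000500) is a bit mask of link-level and transport-level conditions. Below is a minimal sketch of decoding it. The bit positions and short names are given as I recall them from the libata sources, so treat the table as an assumption; only the three names the kernel printed in the log (UnrecovData, Proto, TrStaTrns) are confirmed by the thread itself.

```python
# Bit position -> short name, as I recall them from libata's SError
# reporting (an assumption; check against your kernel's sources).
SERROR_BITS = {
    0: "RecovData",
    1: "RecovComm",
    8: "UnrecovData",
    9: "Persist",
    10: "Proto",
    11: "HostInt",
    16: "PHYRdyChg",
    17: "PHYInt",
    18: "CommWake",
    19: "10B8B",
    20: "Dispar",
    21: "BadCRC",
    22: "Handshk",
    23: "LinkSeq",
    24: "TrStaTrns",
    25: "UnrecFIS",
    26: "DevExch",
}


def decode_serror(value):
    """Return the names of all SError bits set in value, lowest bit first."""
    return [name for bit, name in sorted(SERROR_BITS.items())
            if value & (1 << bit)]


# SErr value taken from the "exception Emask 0x12 ... SErr 0x1000500" line.
print(decode_serror(0x1000500))
```

The decoded names match the "SError: { UnrecovData Proto TrStaTrns }" line in the log, which points at trouble on the link between controller and drive rather than at the platters.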
Thanks for the feedback, David. I should have mentioned that I've tried multiple cables of different designs, both for the data and the power cable, and I've been careful to keep the connectors aligned.

Also, since all the problems seem to go away when I run Knoppix, it seems like it may not be a hardware problem at all now! That's why I called it weird :)

Must do more testing.

Cheers, Dave
Dave,

This sounds like a kernel issue. It may be worth subscribing to the Linux kernel mailing list for a week to get this issue out there and find out which kernel changes could explain the different behavior you see among the different distro kernels. You can find the info for the mailing list at:

http://www.kernel.org/pub/linux/docs/lkml/

IT IS A HIGH VOLUME LIST. So you will want to set up a folder and filter for incoming messages. Filtering on 'From, To, Cc or Bcc' with 'kernel.org' works fine.

The only other thought I have is "Have you tried setting the jumper to limit throughput to 1.5 Gbps?" I know you said it was a 1.5 Gbps drive, but I have seen similar behavior when I have omitted the 1.5 Gbps jumper on Seagate drives on older systems.

Good luck.

--
David C. Rankin, J.D.,P.E.
Rankin Law Firm, PLLC
510 Ochiltree Street
Nacogdoches, Texas 75961
Telephone: (936) 715-9333
Facsimile: (936) 715-9339
www.rankinlawfirm.com
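On the 1.5 Gbps question, the "SStatus 113 SControl 310" values in the logs from the original post already say something about link speed. The sketch below splits them into fields, assuming the usual SATA register layout as I understand it (DET in bits 0-3, SPD in bits 4-7, IPM in bits 8-11) and assuming the kernel prints the registers in hexadecimal; the helper name split_fields is made up for this example.

```python
def split_fields(reg):
    """Split an SStatus/SControl value into its DET, SPD and IPM nibbles."""
    return {
        "DET": reg & 0xF,         # device detection
        "SPD": (reg >> 4) & 0xF,  # speed (negotiated or allowed)
        "IPM": (reg >> 8) & 0xF,  # interface power management
    }


# Meaning of the SPD field (assumed from the SATA register definitions).
SPEEDS = {0: "none", 1: "Gen1 (1.5 Gbps)", 2: "Gen2 (3.0 Gbps)"}

# Values copied from "SATA link up 1.5 Gbps (SStatus 113 SControl 310)".
sstatus = split_fields(int("113", 16))
scontrol = split_fields(int("310", 16))

print("SStatus :", sstatus, "negotiated:", SPEEDS.get(sstatus["SPD"], "unknown"))
print("SControl:", scontrol, "host limit:", SPEEDS.get(scontrol["SPD"], "unknown"))
```

If that reading is right, the link had negotiated 1.5 Gbps and the host side was also restricting it to that speed, so the remaining errors were occurring on a link already running at the lower rate. Whether the drive-side jumper still matters in such a setup is a separate question this sketch cannot answer.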
Just to let everybody know, this did indeed turn out to be a software problem. Or rather, there's a kernel patch that fixes a hardware incompatibility. The VIA VT6421-based controller that I bought works with current-generation Samsung (and WD) disks only with recent kernels. Specifically, those where a patch that rejoices in the name "Joseph Chan's magic patch" has been applied.

I joined the linux-ide list instead of the kernel list (much lower volume!) and with Tejun's help, I managed to sort it out.

Cheers, Dave
participants (3)
- Dave Howorth
- David C. Rankin
- David Haller