[opensuse] Strange SATA problems with openSUSE
I've posted a couple of times about this with no replies yet.... Earlier today, the entire computer came crashing to a halt, so it forced me to spend more time looking into the problem.
The motherboard I have (ASUS M2N-E SLI) has 4 SATA2 ports: SATA 1, 2, 3 and 4. I also have a SATA1 RAID controller with 2 SATA ports. I have drives connected on IDE0 and IDE1 and they are working fine.
Scenario 1: If I leave the RAID card out, and just connect drives to SATA 1 and SATA 2, the computer boots fine. BIOS finds the SATA drives, and Linux is happy.
Scenario 2: If I add drives to SATA 3 and 4 in Scenario 1, the BIOS sees all four drives, but when I boot Linux, it errors out. I can boot the OS, but the logs fill up with errors, and I have serious performance issues until it just dies altogether.
The boot errors look like this:
-----------------
<6>ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<4>ata3.00: qc timeout (cmd 0x27)
<4>ata3.00: failed to read native max address (err_mask=0x4)
<4>ata3: failed to recover some devices, retrying in 5 secs
<6>ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<4>ata3.00: qc timeout (cmd 0x27)
<4>ata3.00: failed to read native max address (err_mask=0x4)
<3>ata3.00: revalidation failed (errno=-5)
<4>ata3: limiting SATA link speed to 1.5 Gbps
<4>ata3.00: limiting speed to UDMA7:PIO5
<4>ata3: failed to recover some devices, retrying in 5 secs
------------------
and they continue on for quite some time.
Scenario 3: If I add the RAID card to Scenario 1, but do not connect any drives to the RAID, all boots and works OK.
Scenario 4: If I connect 2 SATA drives to the RAID card, and have the two drives from Scenario 1 also connected, all works and boots OK.
Scenario 5: If I connect a SATA drive to SATA 3 or 4 in Scenario 4, I get the same results as with Scenario 2... a long list of SATA errors on boot.
Has anyone encountered this before? Could it be a hardware issue (a failing SATA controller on the motherboard), or is it some obscure Linux thing?
C.
-- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org For additional commands, e-mail: opensuse+help@opensuse.org
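For anyone who wants to triage a log like the one above, here is a minimal Python sketch that counts warning-or-worse lines per ATA port. It is not from the thread: the `summarize` helper and its regular expression are illustrative assumptions, and it only understands the `<priority>ataN[.MM]: message` shape seen in the sample.

```python
import re
from collections import Counter

# Sample copied from the boot log quoted in the original post.
LOG = """\
<6>ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<4>ata3.00: qc timeout (cmd 0x27)
<4>ata3.00: failed to read native max address (err_mask=0x4)
<4>ata3: failed to recover some devices, retrying in 5 secs
<6>ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
<4>ata3.00: qc timeout (cmd 0x27)
<4>ata3.00: failed to read native max address (err_mask=0x4)
<3>ata3.00: revalidation failed (errno=-5)
<4>ata3: limiting SATA link speed to 1.5 Gbps
<4>ata3.00: limiting speed to UDMA7:PIO5
<4>ata3: failed to recover some devices, retrying in 5 secs
"""

# "<N>" is the kernel log priority (lower is more severe: 3 = error,
# 4 = warning, 6 = info). "ataN" or "ataN.MM" names the port or device.
LINE_RE = re.compile(r"^<(?P<prio>\d)>(?P<port>ata\d+)(?:\.\d+)?: (?P<msg>.*)$")


def summarize(log_text):
    """Count lines with priority 4 (warning) or more severe, per ATA port."""
    per_port = Counter()
    for line in log_text.splitlines():
        match = LINE_RE.match(line.strip())
        if match and int(match.group("prio")) <= 4:
            per_port[match.group("port")] += 1
    return per_port


summary = summarize(LOG)
for port, count in sorted(summary.items()):
    print(f"{port}: {count} warning/error lines")
```

On a full log, a port that dominates this count is the first one worth unplugging or moving when narrowing things down.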
On 2008/01/25 15:43 (GMT+0100) Clayton apparently typed:
I've posted a couple times about this with no replies yet....
I think this is a question that needs to be asked on a different mailing list: linux-ide@vger.kernel.org http://vger.kernel.org/vger-lists.html
-- "In the beginning was the Word, and the Word was with God, and the Word was God." John 1:1 NIV
Team OS/2 ** Reg. Linux User #211409
Felix Miata *** http://mrmazda.no-ip.com/
<snipped all>
I did not follow all the scenarios, but SATA is very, very picky about power supply issues, which have been the cause of about 50% of the reported issues on linux-ide. If all is good with a couple of disks and then things start failing with more, I would expect the PSU to be the problem.
There is good news: the SATA cable does not have a ground line, so you can power the drives from a different source than the rest of the computer without fear of ground loops. (A big issue normally.)
Quoting from the linux-ide list:
* If you have an extra power supply lying around, connecting some of SATA devices to a separate PSU (don't do it for PATA) and seeing whether the problem continues and on which devices is a great way to rule out power problem. You can power up a PSU without connecting it to a system by...
http://modtown.co.uk/mt/article2.php?id=psumod
HTH
Greg
-- Greg Freemyer
Litigation Triage Solutions Specialist
http://www.linkedin.com/in/gregfreemyer
First 99 Days Litigation White Paper - http://www.norcrossgroup.com/forms/whitepapers/99%20Days%20whitepaper.pdf
The Norcross Group
The Intersection of Evidence & Technology
http://www.norcrossgroup.com
If all is good with a couple disks, and then starts failing with more, I would expect the PS to be the problem.
True, but.... I can have two drives connected to SATA 1 and 2, and a third drive connected to SATA 3 throws things for a loop. If I do not connect anything to SATA 3 and 4, and instead connect 2 drives to the motherboard and 2 drives to my 3rd party RAID card, everything works fine. So with 3 drives I get failures, but with 4 it's fine as long as the 3rd and 4th are not plugged into the motherboard SATA controller.
I have a fairly new 600W PSU from BeQuiet in the case, so there should be enough power for all devices.
I have done some more digging, and it might be related to an obscure problem with the sata_nv kernel module. Some people are reporting similar problems.. not identical, but similar. I found most of the info via a long search through the mailing list archives at kernel.org (thanks for the pointer, Felix). It only seems to affect some people though, and I am not clear yet what exactly is going wrong, or whether there is a fix or patch anywhere to clear it up. Still looking.
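The elimination reasoning above can be written down as a small Python sketch. The port labels (`mb1` to `mb4`, `raid1`, `raid2`) and the helper names are made up for illustration; scenario 5 is recorded with SATA 3 only, and the IDE drives are left out because they are present in every case. It only narrows things down to ports that never appear in a working setup; it cannot tell a kernel bug from a hardware or power fault on those ports.

```python
# (label, ports with a SATA drive attached, did Linux run cleanly?)
# Taken from the five scenarios in the original post plus the
# three-drive case described in the follow-up.
OBSERVATIONS = [
    ("scenario 1", {"mb1", "mb2"}, True),
    ("scenario 2", {"mb1", "mb2", "mb3", "mb4"}, False),
    ("scenario 3", {"mb1", "mb2"}, True),  # RAID card fitted, no drives on it
    ("scenario 4", {"mb1", "mb2", "raid1", "raid2"}, True),
    ("scenario 5", {"mb1", "mb2", "raid1", "raid2", "mb3"}, False),
    ("follow-up", {"mb1", "mb2", "mb3"}, False),
]


def suspects(observations):
    """Ports seen in a failing setup that never appear in a working one."""
    cleared, seen_failing = set(), set()
    for _, ports, ok in observations:
        (cleared if ok else seen_failing).update(ports)
    return seen_failing - cleared


def drive_count_explains(observations):
    """True only if every failing setup has more drives than every working one."""
    passing = [len(ports) for _, ports, ok in observations if ok]
    failing = [len(ports) for _, ports, ok in observations if not ok]
    return max(passing) < min(failing)


print("suspect ports:", sorted(suspects(OBSERVATIONS)))
print("drive count alone explains it:", drive_count_explains(OBSERVATIONS))
```

A working setup with four drives alongside a failing one with three is what makes a simple "too many drives for the PSU" explanation hard to fit, although it says nothing about which power connectors were used.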
* If you have an extra power supply lying around, connecting some of SATA devices to a separate PSU (don't do it for PATA) and seeing whether the problem continues and on which devices is a great way to rule out power problem. You can power up a PSU without connecting it to a system by...
I will give that a try and see. I think I have a spare 400W PSU somewhere...
C
On Jan 25, 2008 1:08 PM, Clayton <smaug42@gmail.com> wrote:
I have a fairly new 600W PSU from BeQuiet in the case... so there should be enough power for all devices.
Bigger PSUs are often harder to work with. I believe they tend to have multiple separate power subsystems. If you are indiscriminate about which connectors you use, you can overload one subsystem while the overall unit is just chugging along fine. I think they call each subsystem a rail?
The old classic 450W was just one big system, so all the connectors were effectively equivalent.
Greg
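To illustrate the point about separate power subsystems (usually called rails), here is a rough Python sketch. Every number in it is invented for illustration: the rail names, the per-rail limits and the device currents are assumptions, not figures from any BeQuiet datasheet or from this thread. It just shows how one rail can be over its limit while the total draw is still within what the unit can supply overall.

```python
# Hypothetical per-rail current limits in amps (not real PSU specs).
RAIL_LIMIT_AMPS = {"12V1": 18.0, "12V2": 18.0}

# Hypothetical devices: (name, rail its connector hangs off, current in amps).
DEVICES = [
    ("graphics card", "12V1", 12.0),
    ("SATA drive 1", "12V1", 2.0),
    ("SATA drive 2", "12V1", 2.0),
    ("SATA drive 3", "12V1", 2.0),
    ("SATA drive 4", "12V1", 2.0),
    ("CPU", "12V2", 8.0),
]


def rail_loads(devices):
    """Sum the current drawn from each rail."""
    loads = {}
    for _, rail, amps in devices:
        loads[rail] = loads.get(rail, 0.0) + amps
    return loads


def overloaded_rails(devices, limits):
    """Return the rails whose summed load exceeds their individual limit."""
    loads = rail_loads(devices)
    return sorted(rail for rail, amps in loads.items() if amps > limits[rail])


loads = rail_loads(DEVICES)
total_load = sum(loads.values())
total_limit = sum(RAIL_LIMIT_AMPS.values())
print("per-rail load:", loads)
print("overloaded rails:", overloaded_rails(DEVICES, RAIL_LIMIT_AMPS))
print(f"total: {total_load} A of {total_limit} A")
```

With these made-up numbers, moving a couple of drives to connectors on the other rail would bring both rails back under their limits without changing the total load at all.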
On Jan 25, 2008 1:37 PM, Greg Freemyer <greg.freemyer@gmail.com> wrote:
Bigger PSUs are often harder to work with. I believe they tend to have multiple separate power subsystems. If you are indiscriminate about which connectors you use, you can overload one subsystem while the overall unit is just chugging along fine. I think they call each subsystem a rail?
The old classic 450W was just one big system, so all the connectors were effectively equivalent.
Clayton, I think you said you were going to try an additional PSU. Did you? Did it help?
Greg
Clayton,
I think you said you were going to try an additional PSU. Did you? Did it help?
No, I didn't buy a bigger PSU (I currently have a BeQuiet 600W http://www.be-quiet.net/be-quiet.net/index.php?StoryID=14 ).
It turns out I was running up against a rather nasty bug in the SATA kernel module (bug number 331610). There is a fix/patch for the problem that can be loaded at install time, but I never tried it... I was already back on 10.2. I might try again at some point, but probably by the time I get around to it, 11.0 will be ready :-)
The short of it: 10.3 will not install/run on my system without a patch to the kernel. 10.2 works fine.
C
On Friday 25 January 2008 07:43, Clayton wrote:
Has anyone encountered this before? Could it be a hardware issue.. a failing SATA controller on the motherboard, or is it some obscure Linux thing?
Hi Clayton,
Was all this working before your crash? In reading your posts (this one and others), I've not seen any scenario since the crash where disks on SATA ports 3 and 4 work with 10.3; at best they are recognized by the BIOS.
Have you eliminated the possibility of any hardware problems with those two ports? I would suggest booting your Scenario 1 but with the two disks attached to SATA 3 and 4 (instead of 1 and 2). (You might have to adjust /boot/grub/device.map temporarily.) Perhaps boot with just one disk at a time? Maybe boot with a live CD to see if there is something specific to 10.3?
Not sure what else to suggest. Best of luck.
-- Don
On Fri, 2008-01-25 at 15:43 +0100, Clayton wrote:
I've posted a couple times about this with no replies yet.... Has anyone encountered this before? Could it be a hardware issue.. a failing SATA controller on the motherboard, or is it some obscure Linux thing?
C.
I can't add any meaningful info, but a month or so ago there were several posts about issues with SATA, including my case (an Asus A7V600 MB w/ 2 SATA drives). The initial sluggish performance and the resultant crash ate all the log messages and one drive. I have not had time to investigate or re-create the issue. I am rather certain that it happened after an online update, about 2 months after going from 10.2 to 10.3. I do believe there is a gremlin lurking here, and not a hardware issue. (My present config of 1 SATA and 1 EIDE is fine.)
Tom in NM
On Friday 25 January 2008 14:43:23 Clayton wrote:
Has anyone encountered this before? Could it be a hardware issue.. a failing SATA controller on the motherboard, or is it some obscure Linux thing?
C.
Please look at bug 331610 (and vote for it, if you think it relates to your problem).
-- Bob
openSUSE 10.3, Kernel 2.6.22.13-0.3-default, KDE 3.5.8 Intel Celeron 2.53GB, 2GB DDR RAM, nVidia GeForce 7600GS
Has anyone encountered this before? Could it be a hardware issue.. a failing SATA controller on the motherboard, or is it some obscure Linux thing?
Please look at bug 331610 (and vote for it, if you think it relates to your problem).
I've looked at the bug, and voted for it.. it is essentially exactly my problem. I got a bit further than other people because I have IDE drives in the mix.
How does my voting for this bug help though? It is closed as Fixed, even though it appears that no one has seen a fix for it yet. How is this fix supposed to be applied, since the problem is there on the master ISO?
I have rolled back to 10.2 and everything is running fine once again. All the weird problems - including the SATA errors, the screen resolution problems and the MPlayer video driver problems that I was fighting with in 10.3 - are gone now that I am running 10.2.
For now... 10.3 is a total write-off for me. I cannot install it on my computer (AMD 64X2 6400+, 4GB RAM, ASUS M2N-E motherboard)... well, I can install it, but it is unstable, crashes all the time, etc. Reminds me of 10.1 :-(
C
On Saturday 26 January 2008 22:15:46 Clayton wrote: Bob wrote:
Please look at bug 331610 (and vote for it, if you think it relates to your problem).
I've looked at the bug, and voted for it.. it is essentially exactly my problem. I got a bit further than other people because I have IDE drives in the mix.
Me too. I have one IDE drive and one SATA drive. I've stuck with 10.3 even though my SATA drive is currently unusable :(
How does my voting for this bug help though? It is closed as Fixed.... even though it appears that no one has seen a fix for it yet... how is this fix supposed to be applied since the problem is there on the master ISO?
Oops. I forgot it had been fixed. I guess I thought that another vote would speed up the rollout of the repaired driver modules.
I have rolled back to 10.2 and everything is running fine once again. All the weird problems - including the SATA errors, the screen resolution problems and the MPlayer video driver problems that I was fighting with 10.3 are gone now that I am running 10.2.
For now... 10.3 is a total write-off for me. I cannot install it on my computer (AMD 64X2 6400+, 4GB RAM, ASUS M2N-E motherboard)... well I can install it, but it is unstable, crashes all the time etc etc. reminds me of 10.1 :-(
The SATA problem is the only one I've had with 10.3. I certainly wouldn't compare it to 10.1. There are niggly problems such as beagle, but that has been covered in other threads.
Good luck, and here's hoping it's fixed before 11.0 GM comes out ;)
-- Bob
openSUSE 10.3, Kernel 2.6.22.13-0.3-default, KDE 3.5.8 Intel Celeron 2.53GB, 2GB DDR RAM, nVidia GeForce 7600GS
Me too. I have one IDE drive and one SATA drive. I've stuck with 10.3 even though my SATA drive is currently unusable :(
I tried, but had so many other little issues. The computer would lock solid after about 2 to 4 hours of uptime... The video was screwed up (MPlayer would not play full screen properly, and MythTV would not autoscale the UI etc etc)... yet using the same X11 version and exact same nVidia driver binary in 10.2 and video works perfectly. Beagle... I never installed it in 10.3 so never encountered any problems with it :-) There were other issues... nothing major, but so many as to make 10.3 totally unusable for everyday use :-( Shame since there are so many nice new things in 10.3.
Good luck, and here's hoping it's fixed before 11.0 GM comes out ;)
Well, 10.2 is fine for me for now. It works great, and is rock solid on my hardware. If I have to wait until 11.0 for a new install, that's fine... I don't mind. I will be rather grouchy though if the same SATA bug is still there in 11.0. :-P
C.
On Sat, 26 Jan 2008 23:15:46 +0100, Clayton wrote:
how is this fix supposed to be applied since the problem is there on the master ISO?
As Tejun wrote, by way of a driver update disk. You insert the disk at installation time and the installation will use the updated drivers it finds on it.
Philipp
On Sun, 2008-01-27 at 15:35 +0100, philipp.thomas@t-link.de wrote:
On Sat, 26 Jan 2008 23:15:46 +0100, Clayton wrote:
how is this fix supposed to be applied since the problem is there on the master ISO?
As Tejun wrote, by way of a driver update disk. You insert the disk at installation time and the installation will use the updated drivers it finds on it.
Philipp,
I recall there were two or three bugs posted, then combined into one. I voted for one of them, and fell back to a new installation on one SATA and one EIDE drive. I have not had time to return to the issue. I rather assumed that a future kernel update (and I have kept current with those) would include whatever fix they came up with.
(In my case, the bug toasted the sdb7 partition (/usr/lib). My /local and /data partitions were RAID5s on the two SATA drives, and I have not done anything yet to attempt a recovery of that information.)
Are you saying that it requires a separate driver CD and a total re-install to cure the issue? If so, I will keep my existing system until 11.0 hits the street.
Tom in NM
Clayton wrote:
Has anyone encountered this before? Could it be a hardware issue.. a failing SATA controller on the motherboard, or is it some obscure Linux thing? Please look at bug 331610 (and vote for it, if you think it relates to your problem). I've looked at the bug, and voted for it.. it is essentially exactly my problem. I got a bit further than other people because I have IDE drives in the mix. How does my voting for this bug help though? It is closed as Fixed.... even though it appears that no one has seen a fix for it yet... how is this fix supposed to be applied since the problem is there on the master ISO?
Tom Patton wrote:
Are you saying that it requires a separate driver cd and a total re-install to cure the issue?
participants (8)
- Bob
- Clayton
- Don Raboud
- Felix Miata
- Greg Freemyer
- philipp.thomas@t-link.de
- Philippe Landau
- Tom Patton