[opensuse] ACPI regression from 11.2 -> 11.3 on GA-MA770T-UD3P
Hello,

I'm hoping that some kind soul can help me diagnose what's going on here, or at least point me to another resource.

Several weeks ago, I upgraded (first via zypper dup, then with a re-install) from 11.2 to 11.3 on a system with the above mobo. Since then, I've encountered all kinds of difficulties that I now believe are all ACPI related. Note that the system had been very stable running 11.2, but I wanted the new Xorg driver for my ATI 4350 card so that I could take advantage of the accelerated driver and associated eye candy.

Initially, I noticed network problems, and kernel messages in /var/log/messages reporting "CPU#n stuck". I've attached a snippet below.

Aug 13 13:30:02 faramir kernel: [ 4331.903317] BUG: soft lockup - CPU#0 stuck for 139s! [swapper:0]

I eventually noticed that *most* of these included the rtl8169 in the call stack. After some Googling, with no luck, I picked up a used Intel 82545GM from eBay to replace the built-in RealTek - it's a better chipset anyway, right?

Well, after this, the above error traces went away, but not all of the bad behaviors. Remaining were the very long boot times (10 minutes) and random system hangs (5-30 seconds). I discovered that any keypress (e.g. Num-Lock) would "unstick" the system, I surmise by generating an interrupt.

Since then, I've been trying combinations of acpi=xx, apm=xxx, and pci=noacpi. After many experiments, the only combination that I could find that makes my system run well is "apm=off pci=noacpi acpi=off" -- in particular, "acpi=off". I've tried other ACPI options but, at least so far, to no avail.

The main downside that I've discovered to "acpi=off" is that my 4 CPU cores (AMD Phenom(tm) II X4 925 Processor) are pegged at 2812.342 MHz, with my CPU fan running full-tilt to keep 'em cool.

I've also attached the output of 'lspci' and 'dmidecode'. Let me know if there's anything else that I can provide, as well.

Just for the record, I'm certainly *not* a newbie, and I'm willing to experiment and/or debug.

Thanks in advance,
-Nick
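For anyone who wants to quantify lockups like the one quoted above, here is a minimal Python sketch that tallies soft-lockup messages per CPU. It assumes every such message follows the same format as the single line from the original post; the function name summarize_lockups is made up for illustration.

```python
import re
from collections import Counter

# Pattern modelled on the one kernel line quoted in the post:
# "BUG: soft lockup - CPU#0 stuck for 139s! [swapper:0]"
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<task>[^\]]+)\]"
)


def summarize_lockups(lines):
    """Return (lockup count per CPU, longest stall in seconds)."""
    per_cpu = Counter()
    longest = 0
    for line in lines:
        match = LOCKUP_RE.search(line)
        if match:
            per_cpu[int(match.group("cpu"))] += 1
            longest = max(longest, int(match.group("secs")))
    return per_cpu, longest


# The only sample used here is the line from the original post.
sample = [
    "Aug 13 13:30:02 faramir kernel: [ 4331.903317] "
    "BUG: soft lockup - CPU#0 stuck for 139s! [swapper:0]",
]
per_cpu, longest = summarize_lockups(sample)
print(dict(per_cpu), longest)  # prints: {0: 1} 139
```

To run it against a real log, pass an open file instead of the sample list, for example summarize_lockups(open("/var/log/messages", errors="replace")). Reading that file usually needs root.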
Hi Nick,

I have the same motherboard. After the upgrade I had all kinds of soft lockups and system freezes. After several weeks and many suspicions of both software and hardware, I reset the BIOS to the fail-safe defaults and all my problems were gone. I suspect the C1E or virtualization setting in the BIOS. I think this will help you out of your problems as well.

Success,
Hans

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
For additional commands, e-mail: opensuse+help@opensuse.org
I have the same mobo, with no problems in either 11.2 or 11.3. I compared your setup with mine; the primary diffs on mine are:

* RTL8111
  Not had any issues at all; hung off a Gb router.

* Nvidia graphics card
  Got the correct kernel mode setting for your ATI card? I've noticed the documentation on using KMS is not entirely consistent. Your card may need it disabled. It can be done with a kernel boot argument or a change to the initrd via /etc/sysconfig (check the release notes or SDB).

* 8GB RAM
  G.Skill DDR3 1600, stock timings

* X2 550

* Bios F8
  I would concentrate here first. The combination of ACPI issues and the fact that your CPU is apparently not throttling, as indicated by the pegged clock plus fan speeds, suggests an issue with those BIOS settings and/or chip vis-a-vis the kernel. So . . .

  First, upgrade your F5 BIOS to F8, especially since you have a quad; IIRC the ACC BIOS section initially had a few issues. (I went from F4 to F8, so no experience with F5.) I would also try using the "optimized defaults" option (which is recommended anyway upon upgrading the BIOS, after which your tweaks can be added back).

  Second, try disabling cool-n-quiet and the SMART fan controls, and add the boot argument "cpufreq=no". This will disable the throttling; see if that makes a difference.

  Third, try adding the boot argument "pci=nomsi".

  Fourth, try a different kernel (more than one can be installed side-by-side).

* initrd modules: thermal pata_atiixp ahci ata_generic processor fan atiixp ide_pci_generic
  It appears you have an added adapter controller (hardware RAID?); I didn't see any issues there.

Hope something above helps. Good luck.
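When juggling boot arguments like the ones discussed in this thread, it can help to confirm which ones the running kernel was actually started with. The following is a rough Python sketch that splits a kernel command line into parameters. The helper names are hypothetical, the "root=" value in the example is a placeholder rather than anything from the thread, and the simple whitespace split ignores quoted values that contain spaces.

```python
def parse_cmdline(cmdline):
    """Map each boot parameter to the list of values it was given.

    Bare flags (no '=') get the value None. A list is used because a
    parameter such as pci= may legitimately appear more than once.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params.setdefault(key, []).append(value if sep else None)
    return params


def read_running_cmdline(path="/proc/cmdline"):
    """Return the command line of the currently running Linux kernel."""
    with open(path) as handle:
        return handle.read().strip()


# Example string only; root=/dev/sda2 is a made-up placeholder.
example = "root=/dev/sda2 apm=off pci=noacpi acpi=off"
params = parse_cmdline(example)
print(params.get("acpi"))  # prints: ['off']
print(params.get("pci"))   # prints: ['noacpi']
```

On a live system, parse_cmdline(read_running_cmdline()) shows what is in effect for the current boot, which is a quick way to rule out a typo in the bootloader entry before drawing conclusions from an experiment.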
* Nick LeRoy (leroy.nick@gmail.com) [20100831 04:12]:
Just for the record, I'm certainly *not* a newbie, and I'm willing to experiment and/or debug.
Then why don't you file a bug report in Bugzilla? The chances that one of our kernel developers gets to see it there are immensely higher than here.

Philipp
On 8/30/2010 7:11 PM, Nick LeRoy wrote:
Since then, I've been trying combinations of acpi=xx, apm=xxx, and pci=noacpi. After many experiments, the only combination that I could find that makes my system run well is "apm=off pci=noacpi acpi=off" -- in particular, "acpi=off". I've tried other ACPI options but, at least so far, to no avail.

Are you sure ACPI was ever ON? Is there a chance that your mobo fell out of the ACPI approved list and ACPI was disabled all along? The only way to tell is to read /var/log/boot.msg carefully.

--
At one time I had a Real Sig. It's been downsized.
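Reading a whole boot log by eye is tedious, so here is a small Python sketch that pulls out only the lines mentioning ACPI, with their line numbers. It is just a case-insensitive filter: which lines actually appear, and what they say about ACPI being enabled or disabled, depends on the kernel and is not assumed here. The function names are hypothetical.

```python
def acpi_lines(lines):
    """Yield (line_number, text) for each line that mentions ACPI."""
    for number, line in enumerate(lines, start=1):
        if "acpi" in line.lower():
            yield number, line.rstrip("\n")


def show_acpi_lines(path="/var/log/boot.msg"):
    """Print every ACPI-related line of the given boot log."""
    with open(path, errors="replace") as handle:
        for number, text in acpi_lines(handle):
            print(f"{number}: {text}")
```

Calling show_acpi_lines() with no argument reads the file named above; pass another path to compare the log from an 11.2 boot against one from 11.3, if both are still around.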
For your attention: my system has now run for 2 days without any slowdown, freeze, or other problem. With C1E support enabled in the BIOS, I get all kinds of display hiccups and other troubles at random moments within half an hour.

I think this is a (the) problem related to openSUSE 11.3 and the GA-MA770T-UD3P.

Success,
Hans
* Hans de Faber <hans.defaber@gmail.com> [08-31-10 16:12]:
For your attention: my system has now run for 2 days without any slowdown, freeze, or other problem. With C1E support enabled in the BIOS, I get all kinds of display hiccups and other troubles at random moments within half an hour.
I think this is a (the) problem related to openSUSE 11.3 and the GA-MA770T-UD3P.

And *you* have made a bug report? Else, how do you expect it to be solved?

--
Patrick Shanahan, Plainfield, Indiana, USA
HOG # US1244711
http://wahoo.no-ip.org
Photo Album: http://wahoo.no-ip.org/gallery2
Registered Linux User #207535 @ http://counter.li.org
On 31/08/10 22:39, Patrick Shanahan wrote:
And *you* have made a bug report?
Else, how do you expect it to be solved?
No, that's not my way of doing things. When you start with such a problem, everything is suspect. You have to sort things out first. Then look at the mailing list, and if there are no lookalike troubles there, post your troubles on the list. If there is no response, then it's unlikely to be a structural error in openSUSE. Then switch back to 11.2 to do your work and keep watching the mailing list. If there is still nothing after some weeks, then the problem is likely specific to my configuration. My laptop, also AMD hardware, runs flawlessly on 11.3.

Then it's time to do something rigorous. I changed my video card from Nvidia to AMD, because the most common errors were display freezes, and reinstalled openSUSE 11.3 on a spare partition. I got the same troubles. Impossible, I cried. At that point the motherboard was the number 1 suspect. First I changed my BIOS to the fail-safe defaults, and the problem was gone.

So if I had filed a bug report, I would still be waiting now. Lesson: before you file a bug report, the primary suspect should be something in the software!!!
For your attention: my system has now run for 2 days without any slowdown, freeze, or other problem. With C1E support enabled in the BIOS, I get all kinds of display hiccups and other troubles at random moments within half an hour.
I think this is a (the) problem related to openSUSE 11.3 and the GA-MA770T-UD3P.
Success, Hans

My C1E setting is "auto".
Hello,

Here's a follow-up. Following the advice of dwgallien <dwgallien@gmail.com>, I updated my BIOS from F5 to F10. At least preliminarily, this seems to have solved the problems. I haven't tried using the RealTek network controller, but, after running for a few minutes, I haven't had any of the otherwise frequent hiccups or other problems. I'll post another follow-up after letting it run for a bit longer. I feel stupid for not having thought of the BIOS update myself, but, oh, well.

Thanks *very much* to all who responded so quickly.

-Nick
Hello, Final follow up:
Here's a follow up. Following the advice of dwgallien <dwgallien@gmail.com>, I updated my BIOS from F5 to F10. At least preliminarily, this seems to have solved the problems. I haven't tried using the RealTek network controller, but, after running for a few minutes, I haven't had any of the otherwise frequent hiccups or other problems. I'll post another follow-up after letting it run for a bit longer. I feel stupid for not having thought of the BIOS update myself, but, oh, well.
After a couple of weeks running with the upgraded BIOS, I can confirm that this did solve the problems I was experiencing. So, if you're running a GA-MA770T-UD3P, upgrade the BIOS to F10. Do so now. :)

Thanks again for all the help.

-Nick

--
<<< The matrix has you. >>>
Nick LeRoy
http://www.cs.wisc.edu/~nleroy
leroy.nick@gmail.com
920-568-0151
The University of Wisconsin
Department of Computer Sciences
participants (6)
- dwgallien
- Hans de Faber
- John Andersen
- Nick LeRoy
- Patrick Shanahan
- Philipp Thomas