[Bug 215300] New: nForce4 based motherboard unstable in SMP configuration
https://bugzilla.novell.com/show_bug.cgi?id=215300

Summary: nForce4 based motherboard unstable in SMP configuration
Product: SUSE Linux 10.1
Version: Final
Platform: x86-64
OS/Version: SuSE Linux 10.1
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: grok@tnt.pl
QAContact: qa@suse.de

Running an AMD Athlon 64 X2 dual core with a PCI-E based motherboard under the -smp kernel causes the machine to be unstable. The suggested fix "pci=nommconf" does not help in the long term.

The following configurations were tested:

Motherboards:
- ASUS A8N-SLI Premium
- ASUS A8N-SLI Deluxe

Graphic cards:
- 2x nVidia Corporation GeForce 7800 GTX (both SLI and separate)
- 1x nVidia Corporation GeForce 7950 GX2

Memory:
- 2x 1GB (dual channel configuration)

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #1 from grok@tnt.pl 2006-10-26 09:02 MST -------
(description continued)

- 1x 1GB (single channel configuration)

Graphic card drivers:
- nVidia 8762 (SUSE rpms)
- nVidia 9625 (.run file)

With pci=nommconf the instability manifests itself in the same way as without it (crashes, freezes, etc.).
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #2 from kkeil@novell.com 2006-10-26 09:18 MST -------
This may be related to SLI rather than to PCI-E or nForce4 in general. I have been running an ASUS A8N-E board with a GeForce 6600 and an AMD Athlon 64 X2 dual core as my main development system since January.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #3 from grok@tnt.pl 2006-10-26 09:24 MST -------
I have tested the SLI configuration, separate screens, and Xinerama. All are consistently unstable. However, I run quite substantial screen estate (1280x1600 - 2560x1600 - 1280x1600) and have noticed that the higher the resolution, the quicker the instabilities crop up.

When using -smp with pci=nommconf, the instabilities manifest themselves much later (a matter of 10-20 minutes) instead of immediately (within about 30 seconds of starting anything but the 2D desktop, e.g. glxgears).

One of the early symptoms before the system goes down in a freeze is intermittent, extremely rapid keypress repeat.
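When comparing runs with and without the workaround, it helps to confirm that the option actually reached the kernel. The following is a minimal sketch, not part of the original report: the helper names boot_params and has_pci_option are hypothetical, and the parsing assumes a simple space-separated command line without quoted values.

```python
def boot_params(cmdline):
    """Map each boot parameter name to its list of comma-separated values.

    Assumption: parameters are separated by whitespace and contain no quoting.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        values = params.setdefault(key, [])
        if sep:
            values.extend(value.split(","))
    return params


def has_pci_option(cmdline, option="nommconf"):
    """Return True if pci=<...> on the command line includes the given option."""
    return option in boot_params(cmdline).get("pci", [])


def running_cmdline(path="/proc/cmdline"):
    """Read the command line of the running kernel (Linux only)."""
    with open(path) as handle:
        return handle.read()
```

On the affected machine one would call has_pci_option(running_cmdline()). The match is exact, so a misspelled option name is reported as absent.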
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #4 from grok@tnt.pl 2006-10-26 09:40 MST -------
Forgot to add: when testing with the GeForce 7950 GX2 I also tried a configuration with a regular PCI graphics card (FX5200 PCI). Despite using only one PCI-E slot, that configuration was also unstable (exactly the same results). It seems that it may after all be related somehow to PCI-E and SMP.

Of course, switching to the -default kernel (only 1 CPU) solves the problem completely.
https://bugzilla.novell.com/show_bug.cgi?id=215300

gregkh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |grok@tnt.pl

------- Comment #5 from gregkh@novell.com 2006-10-30 12:20 MST -------
Can you attach the output of 'hwinfo'?
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #6 from grok@tnt.pl 2006-10-31 03:19 MST -------
Created an attachment (id=103147)
 --> (https://bugzilla.novell.com/attachment.cgi?id=103147&action=view)
uniprocessor hwinfo

hwinfo of stable configuration
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED

------- Comment #7 from grok@tnt.pl 2006-10-31 03:20 MST -------
Created an attachment (id=103148)
 --> (https://bugzilla.novell.com/attachment.cgi?id=103148&action=view)
SMP configuration

hwinfo of unstable configuration
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #8 from grok@tnt.pl 2006-10-31 03:22 MST -------
While generating smp.hwinfo the instabilities showed up immediately: there were spontaneous rapid keypress repeats. This does not occur if I use a PCI graphics card instead of the PCI-E one.
https://bugzilla.novell.com/show_bug.cgi?id=215300

gregkh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |grok@tnt.pl

------- Comment #9 from gregkh@novell.com 2006-10-31 11:48 MST -------
Does the problem go away when you do not have the nvidia closed source driver loaded?
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED

------- Comment #10 from grok@tnt.pl 2006-11-01 01:57 MST -------
Created an attachment (id=103269)
 --> (https://bugzilla.novell.com/attachment.cgi?id=103269&action=view)
nv-based xorg.conf

Yes.. WWhaaat yooouuu see here is whhaat hhapppeendss tooo tthhee keyyybbboaarrd iinnputt.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #11 from grok@tnt.pl 2006-11-01 02:14 MST -------
About 5 minutes after posting the attachment I was unable to move the cursor between screens; after a further 2-3 minutes I was unable to move windows around, and about a minute later X froze completely.

The machine still responded to ICMP echo requests, and I was able to ssh in and run killall -9 X. After that X never came back (garbage on screen, local keyboard locked (both USB and PS/2), and obviously no access to VT1-6). The ssh session was still live.

Switching back to the UP (-default) kernel fixes the problem.
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Platform|x86-64                      |i386

------- Comment #12 from grok@tnt.pl 2006-11-01 02:15 MST -------
I'm running the 32-bit version of the OS.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #13 from grok@tnt.pl 2006-11-01 02:17 MST -------
Uhm, excuse the poor choice of words. No, the problem does NOT go away if I switch to the open-source driver and remove the proprietary driver completely from the system (including libglx etc.).
https://bugzilla.novell.com/show_bug.cgi?id=215300

gregkh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |grok@tnt.pl

------- Comment #14 from gregkh@novell.com 2006-11-01 23:14 MST -------
Can you attach the output of 'hwinfo' with the unstable configuration without the nvidia drivers loaded?
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED

------- Comment #15 from grok@tnt.pl 2006-11-02 02:43 MST -------
Created an attachment (id=103468)
 --> (https://bugzilla.novell.com/attachment.cgi?id=103468&action=view)
SMP with opensource nv driver hwinfo
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #16 from grok@tnt.pl 2006-11-02 02:47 MST -------
(From update of attachment 103269)
Attachment #103269 has old (invalid) PCI IDs. The correct values are:

Identifier "Device[0]"
[...]
BusID "PCI:1:0:0"

Identifier "Device[1]"
[...]
BusID "PCI:2:0:0"

Identifier "Device[2]"
[...]
BusID "PCI:2:0:0"

Sorry about that, I have way too many xorg.confs :)
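A common source of stale BusID values is that lspci prints bus addresses in hexadecimal, while the BusID entry in xorg.conf is written in decimal. The sketch below, which is not from the original thread, converts one notation to the other. The function name lspci_to_xorg_busid is hypothetical, and the PCI domain prefix, if present, is simply ignored as a simplifying assumption.

```python
import re

# lspci address: optional 4-digit domain, then bus:device.function in hex.
_LSPCI_ADDR = re.compile(
    r"^(?:[0-9a-fA-F]{4}:)?([0-9a-fA-F]{2}):([0-9a-fA-F]{2})\.([0-7])$"
)


def lspci_to_xorg_busid(addr):
    """Convert an lspci address such as '01:00.0' to 'PCI:1:0:0'.

    Assumption: the domain part (e.g. '0000:') is dropped, which is only
    appropriate on single-domain machines.
    """
    match = _LSPCI_ADDR.match(addr.strip())
    if match is None:
        raise ValueError("not an lspci bus address: %r" % addr)
    bus, device, function = (int(part, 16) for part in match.groups())
    return "PCI:%d:%d:%d" % (bus, device, function)
```

For example, lspci_to_xorg_busid("01:00.0") yields "PCI:1:0:0", matching the first corrected entry, while a card reported at "0a:00.0" would need "PCI:10:0:0".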
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #17 from grok@tnt.pl 2006-11-13 06:51 MST -------
Hi,

Is there any other debug information I could supply?
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #18 from vogt@itwm.fraunhofer.de 2006-11-14 06:07 MST -------
Hello,

I would like to confirm this problem. We are currently running some Fujitsu Siemens Esprimo machines with AMD dual core X2 CPUs and x86_64. It seems to be related to dual core; other, single-core machines are fine.

The freeze happens 3-4 times a day. It is not related to load; usually it happens while surfing the net etc. Nothing in the log. Machine dead (no ping, ...).

Regards,
Martin

PS: nforce4 board
https://bugzilla.novell.com/show_bug.cgi?id=215300

gregkh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |grok@tnt.pl

------- Comment #19 from gregkh@novell.com 2006-11-15 17:04 MST -------
Can you try the 10.2 Beta2 releases? There are a number of dual core issues fixed there.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #20 from grok@tnt.pl 2006-11-15 17:21 MST -------
I can't upgrade the whole system (yet), but I can rebuild kernel-2.6.18.1-24.4 on my 10.1 (to get kernel sources I can build against). Would that do?
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #21 from grok@tnt.pl 2006-11-15 17:25 MST -------
Make that 2.6.18.2-12.
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #22 from gregkh@novell.com 2006-11-15 17:43 MST -------
It would be a good start, yes. But please note that with this kernel release, other parts of your system might not work properly (udev, network manager, HAL, etc.).
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|grok@tnt.pl                 |

------- Comment #23 from grok@tnt.pl 2006-11-16 07:19 MST -------
2.6.18.2-5-bigsmp #1 SMP Tue Nov 7 16:02:06 GMT 2006 i686 athlon i386 GNU/Linux

seems stable (as far as I have tested it so far). The keyboard issues do not seem to surface on the timescale experienced before.

I used the following packages (the later bigsmp.nosrc was not available):

  rpm -ivh kernel-source-2.6.18.2-5.src.rpm
  rpmbuild --rebuild kernel-bigsmp-2.6.18.2-5.nosrc.rpm

and installed them (ignoring the kernel-syms dependency). There were no issues with udev/NM/HAL etc. (i.e. nothing failed on boot).

I would suggest leaving this open for a day or two for longer testing. Meanwhile, is it possible to back-port the fixes to 10.1 for the next update?
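When collecting reports from several machines, it can be handy to split a release string such as the one quoted in comment #23 into its version, build, and flavor parts, so that SMP and UP kernels can be told apart. This is only an illustrative sketch: parse_release and is_smp_flavor are hypothetical names, the pattern assumes the version-build-flavor layout seen in this thread, and the set of SMP flavors is limited to the ones mentioned here (-smp and -bigsmp).

```python
import re

# Assumed layout: <dotted version>-<build>-<flavor>, e.g. 2.6.18.2-5-bigsmp
_RELEASE = re.compile(r"^(\d+(?:\.\d+)*)-([\d.]+)-([A-Za-z0-9_]+)$")

# Flavors named in this thread; -default is the uniprocessor kernel here.
SMP_FLAVORS = {"smp", "bigsmp"}


def parse_release(release):
    """Split a kernel release into (version tuple, build string, flavor)."""
    match = _RELEASE.match(release.strip())
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    version = tuple(int(part) for part in match.group(1).split("."))
    return version, match.group(2), match.group(3)


def is_smp_flavor(flavor):
    """Return True for the SMP flavors mentioned in this thread."""
    return flavor in SMP_FLAVORS
```

The version tuple compares naturally, so parse_release("2.6.18.2-5-bigsmp")[0] >= (2, 6, 18) is a simple way to check for the newer kernel series. The release string itself would come from uname -r or platform.release().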
https://bugzilla.novell.com/show_bug.cgi?id=215300
------- Comment #24 from vogt@itwm.fraunhofer.de 2006-11-20 00:45 MST -------
Hello,

I can install the 10.2 kernel on a test machine. Currently the production machine has powersaved disabled (service stopped) and has an uptime of 4 days.

==> maybe it's powersaved related.

Martin
https://bugzilla.novell.com/show_bug.cgi?id=215300

gregkh@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #25 from gregkh@novell.com 2006-11-29 16:44 MST -------
So 10.2 works? Great, we'd recommend using that instead :)

If not, please reopen and let us know.
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |REOPENED
         Resolution|FIXED                       |

------- Comment #26 from grok@tnt.pl 2006-12-04 03:21 MST -------
Since this is filed under 10.1, I would like to ask if/when the fix could be backported to 10.1. Running a kernel from a newer version may be sub-optimal (security, updates, maintainability).
https://bugzilla.novell.com/show_bug.cgi?id=215300

lmb@novell.com changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

------- Comment #27 from lmb@novell.com 2007-01-09 11:48 MST -------
We will not backport fixes from 10.2 to 10.1, sorry. The openSUSE series is community supported, and we only ever support the latest version, except for security fixes. Please update to 10.2.
https://bugzilla.novell.com/show_bug.cgi?id=215300

grok@tnt.pl changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED

------- Comment #28 from grok@tnt.pl 2007-01-10 01:54 MST -------
That's ok (now :). At the time 10.1 was perfectly reasonable. Thanks anyway!