[Bug 231205] New: Freeze very early in boot process with SMP
https://bugzilla.novell.com/show_bug.cgi?id=231205

           Summary: Freeze very early in boot process with SMP
           Product: openSUSE 10.2
           Version: Final
          Platform: i686
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: fkamogee@yahoo.com
         QAContact: qa@suse.de

This system can run SUSE 10.1's SMP kernel, but both the 10.2 installer and the installed 10.2 system hang very quickly at boot unless SMP is disabled using kernel parameter nosmp or maxcpus. The last line printed on the installed system's console before it hangs is "NET: Registered protocol family 2". When SMP is disabled, everything seems fine, and the next line is "IP route cache hash table entries: 32768 (order: 5, 131072 bytes)".

I have tried the kernel of the day (2.6.18.5-SL102_BRANCH_20061223002647-default), and I have tried a number of other kernel parameters including acpi=off, apm=off, ide=nodma, pci=routeirq, edd=off, noapic, nolapic, init=/bin/sh... I have not found anything that lets the system boot without disabling SMP.

There is no OOPS or PANIC printed on screen or in any logs that I can find. In fact I'm pretty sure it freezes before anything is written to disk at all. I tried hooking up netconsole and got nothing at all, but I can't say for sure that it was configured correctly. I don't have the equipment to set up a serial console, but if that's necessary I could probably track it down.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #1 from fkamogee@yahoo.com 2006-12-31 17:04 MST -------
Created an attachment (id=111231) --> (https://bugzilla.novell.com/attachment.cgi?id=111231&action=view)
hwinfo output
https://bugzilla.novell.com/show_bug.cgi?id=231205
fkamogee@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |fkamogee@yahoo.com
------- Comment #2 from fkamogee@yahoo.com 2007-01-01 14:30 MST -------
After continued googling, I got a hunch that the "stack unwinder" could be the source of my troubles. Linus removed it altogether for 2.6.20-rc2. I just built a vanilla 2.6.20-rc3 kernel, and sure enough, it boots correctly (with nolapic). I will try to narrow down which kernel versions work and which don't. If my hunch is correct, that'll be an easy task.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #3 from fkamogee@yahoo.com 2007-01-01 21:09 MST -------
To take my build process out of the equation, I verified that a custom build of the default SUSE kernel does in fact exhibit the freeze. Vanilla 2.6.19 boots. Maybe the unwinder fixes therein are enough to take care of it, or maybe it's something else entirely. I'll have more information tomorrow.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #4 from fkamogee@yahoo.com 2007-01-02 12:51 MST -------
I built the default SUSE kernel (2.6.18.2-34) again, this time with CONFIG_UNWIND_INFO disabled, and it still froze up. I guess the unwinder is not the problem.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #5 from fkamogee@yahoo.com 2007-01-02 18:15 MST -------
Vanilla 2.6.18.2 boots. Looks like one of the patches in -34 must be the culprit.
https://bugzilla.novell.com/show_bug.cgi?id=231205
fkamogee@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |SMP
         OS/Version|Other                       |SuSE Other
------- Comment #6 from fkamogee@yahoo.com 2007-01-02 18:28 MST -------
Yikes, there are many patches. Can somebody give me a clue as to which ones might be suspect?
https://bugzilla.novell.com/show_bug.cgi?id=231205
hrieger@izeit.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
OtherBugsDependingOnThis|                       |227279
https://bugzilla.novell.com/show_bug.cgi?id=231205
lmb@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |fkamogee@yahoo.com
------- Comment #7 from lmb@novell.com 2007-01-05 09:20 MST -------
Say, does booting with maxcpus=0 work for you?
https://bugzilla.novell.com/show_bug.cgi?id=231205
fkamogee@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|fkamogee@yahoo.com          |
------- Comment #8 from fkamogee@yahoo.com 2007-01-05 11:29 MST -------
Yes, that's effectively the same as nosmp, isn't it? With either, I can boot and have only one CPU enabled.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #9 from alyaz@iam.uni-bonn.de 2007-01-10 06:56 MST -------
Does booting kernel-bigsmp work for you?
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #10 from Joseph.Comfort@asu.edu 2007-01-10 12:49 MST -------
(In reply to comment #9)
I have nearly identical problems with an x86_64 installation. Curiously, an i586 DVD ISO does not cause hangs or freezes. There is no kernel-bigsmp for x86_64 that I can see. There is a lot more information available for bug 232013.
https://bugzilla.novell.com/show_bug.cgi?id=231205
ludek@dolejsky.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ludek@dolejsky.com
------- Comment #11 from ludek@dolejsky.com 2007-01-11 14:49 MST -------
I have the same problem. I am running SUSE 10.2 on a dual-core Lenovo 3000 n100 and the system hangs while booting. When I pass "nosmp" to the kernel, it works fine (using just one core). I tried a vanilla kernel and there were no problems (the system used both cores). It would be great if this gets fixed...
https://bugzilla.novell.com/show_bug.cgi?id=231205
gregkh@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |fkamogee@yahoo.com
------- Comment #12 from gregkh@novell.com 2007-01-11 21:33 MST -------
Paul, can you get a kernel log oops message when the machine hangs?
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #13 from Joseph.Comfort@asu.edu 2007-01-11 21:48 MST -------
(In reply to comment #11)
I have been watching this bug and also bug 232013. Following a suggestion in 232013, I downloaded and installed the latest kotd. All problems are resolved. Whatever got fixed needs to be backported. New install ISOs also need to be posted on the download sites. For comment #12, there are a bunch of logs in bug 232013.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #14 from gregkh@novell.com 2007-01-11 21:55 MST -------
Thanks for letting us know that the KOTD fixes this issue. But merely backporting will not work, as that kernel is based on 2.6.20-rc4 or so, and the 10.2 kernel is 2.6.18 based. _lots_ of things have changed in between these releases :) If the KOTD works for you, I'd recommend just using that.
https://bugzilla.novell.com/show_bug.cgi?id=231205
kasievers@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |Joseph.Comfort@asu.edu
------- Comment #15 from kasievers@novell.com 2007-01-12 00:06 MST -------
*** Bug 231056 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #16 from fkamogee@yahoo.com 2007-01-12 09:28 MST -------
Glad to see some activity here; I've been doing some unexpected traveling. The KOTD did not work for me when I first posted this, but this weekend I will try again with the latest KOTD, and I will also try bigsmp. Greg, I think the only way I might possibly find an OOPS message is to set up a serial console, but it hangs so early that I'm not even sure whether that will work. If neither the newer KOTD nor the bigsmp kernel works for me, I guess I'll start trying to track down a suitable serial cable.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #17 from fkamogee@yahoo.com 2007-01-13 08:49 MST -------
Neither kernel-default-2.6.18.5-SL102_BRANCH_20070111163922.i586.rpm nor kernel-bigsmp-2.6.18.5-SL102_BRANCH_20070111163922.i586.rpm works for me. It sounds like those who have had success with KOTD were using the head branch? I don't see any binaries there so I can't try it real quick.
https://bugzilla.novell.com/show_bug.cgi?id=231205
fkamogee@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|fkamogee@yahoo.com          |
------- Comment #18 from fkamogee@yahoo.com 2007-01-13 15:11 MST -------
I haven't been able to track down a serial cable. I'd have to order one online, and I'm not sure it's worth the cost to me, as I've promised myself I would not put any more money into this machine. I'm clearing the NEEDINFO. If you decide that's the only hope of figuring this out, you can set it back... but I don't think we are there yet.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #19 from fkamogee@yahoo.com 2007-01-13 15:13 MST -------
Can anyone else that can reproduce this bug get an OOPS for us?
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #20 from fkamogee@yahoo.com 2007-01-14 09:34 MST -------
From the HEAD branch, kernel-default-2.6.20_rc5-20070113193557.i586.rpm works for me.
I'm still willing to test any idea that'll help get the 10.2 branch fixed.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #21 from ludek@dolejsky.com 2007-01-14 10:12 MST -------
Yes, I confirm that kernel-default-2.6.20_rc5-20070113193557.i586.rpm from kotd works for me too.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #22 from fkamogee@yahoo.com 2007-01-15 08:24 MST -------
Just to let you know, during the evenings this week, I will be systematically excluding patches from 2.6.18.2 and rebuilding, to see if I can identify which causes the hang.
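The exclude-and-rebuild plan in comment #22 can be organised as a bisection over the patch series, which needs roughly log2(N) rebuilds instead of N. The following is only a sketch under stated assumptions: the patches apply in a fixed order, the kernel boots with none of them and hangs with all of them, a single patch introduces the hang, and the boot command line stays the same between runs. `first_bad_patch`, `boots_with`, `fake_boot` and the `patches.example/...` names are hypothetical; in practice `boots_with` would be a manual build-and-reboot step.

```python
from typing import Callable, Sequence


def first_bad_patch(patches: Sequence[str],
                    boots_with: Callable[[Sequence[str]], bool]) -> str:
    """Return the first patch in the series whose inclusion makes boot hang.

    Invariant: the kernel boots with patches[:lo] and hangs with patches[:hi].
    """
    lo, hi = 0, len(patches)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if boots_with(patches[:mid]):
            lo = mid   # the culprit is at index mid or later
        else:
            hi = mid   # the culprit is before index mid
    return patches[hi - 1]


# Stand-in for "apply these patches, rebuild, reboot, watch the console".
# Only patches.arch/i386-apic-auto is a name taken from this bug report;
# the other entries are made-up placeholders.
series = [
    "patches.example/one",
    "patches.arch/i386-apic-auto",
    "patches.example/two",
]


def fake_boot(applied: Sequence[str]) -> bool:
    return "patches.arch/i386-apic-auto" not in applied


print(first_bad_patch(series, fake_boot))  # prints patches.arch/i386-apic-auto
```

If the hang depends on a combination of a patch and a boot parameter, as reported later in this bug, the single-culprit assumption only holds while the command line is kept fixed across all test boots.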
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #23 from fkamogee@yahoo.com 2007-01-16 20:35 MST -------
Here's something I found surprising: Even with only one CPU physically in the system, it still hangs.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #24 from fkamogee@yahoo.com 2007-01-17 06:40 MST -------
Alright folks, I'm quite confident (but have not proven exhaustively... maybe next week) that for me, the hang is caused by the patch called patches.arch/i386-apic-auto IN CONJUNCTION WITH use of the kernel parameter nolapic.
I am posting this while running the distributed 2.6.18.2-34 binary with kernel parameter noapic but NOT nolapic and NOT nosmp/maxcpus=0/maxcpus=1. At this time, I still have only one processor physically plugged in.
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |trenn@novell.com
------- Comment #25 from trenn@novell.com 2007-01-18 09:17 MST -------
Can you please post the dmidecode output of your machine?
https://bugzilla.novell.com/show_bug.cgi?id=231205
kai@kaishome.de changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |kai@kaishome.de
------- Comment #26 from kai@kaishome.de 2007-01-18 16:58 MST -------
I can confirm this bug. It freezes at the line "NET: Registered protocol family 2" with no more info printed. My kernel append line is:
root=/dev/hda8 vga=0x317 splash=silent resume=/dev/hda6 acpismp=force apm=power-off showopts
Appending "nosmp" makes the system boot for now, so I will leave it until this bug is fixed.
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #27 from kai@kaishome.de 2007-01-18 17:03 MST -------
BTW: Which package contains dmidecode? I'd like to contribute to this bug...
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
------- Comment #28 from trenn@novell.com 2007-01-19 04:02 MST -------
It's in pmtools and should be in the default installation. It needs to be run as root.
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |fkamogee@yahoo.com
https://bugzilla.novell.com/show_bug.cgi?id=231205
fkamogee@yahoo.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
      Info Provider|fkamogee@yahoo.com          |
------- Comment #29 from fkamogee@yahoo.com 2007-01-19 18:36 MST -------
Created an attachment (id=114008) --> (https://bugzilla.novell.com/attachment.cgi?id=114008&action=view)
dmidump output
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #30 from kai@kaishome.de 2007-01-20 03:56 MST -------
I have no default installation - it's pretty minimalistic and I use only the smart package manager to upgrade. ;-) Well, first point: I can confirm that kernel-bigsmp boots without freezing and also shows me both CPUs in /proc/cpuinfo. So that works for now. I'll attach my dmidump soon...
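The /proc/cpuinfo check mentioned above can be scripted when testing several kernels in a row. A minimal sketch, assuming the x86 layout in which each online CPU gets one "processor : N" entry; `count_processors` is a hypothetical helper and the sample text is made up for illustration, not taken from any machine in this report.

```python
def count_processors(cpuinfo_text: str) -> int:
    """Count the 'processor' entries in /proc/cpuinfo-style text."""
    return sum(
        1
        for line in cpuinfo_text.splitlines()
        if line.split(":")[0].strip() == "processor"
    )


# Made-up sample in the usual "key<TAB>: value" layout.
SAMPLE = (
    "processor\t: 0\n"
    "model name\t: hypothetical CPU\n"
    "\n"
    "processor\t: 1\n"
    "model name\t: hypothetical CPU\n"
)

print(count_processors(SAMPLE))  # prints 2
```

On a live system the input would come from reading /proc/cpuinfo; a result of 1 after booting with nosmp or maxcpus=0 and 2 with a working SMP kernel would match what the reporters describe.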
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #31 from kai@kaishome.de 2007-01-20 03:58 MST -------
Created an attachment (id=114025) --> (https://bugzilla.novell.com/attachment.cgi?id=114025&action=view)
dmidump output
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #32 from kai@kaishome.de 2007-01-20 04:05 MST -------
Created an attachment (id=114028) --> (https://bugzilla.novell.com/attachment.cgi?id=114028&action=view)
hwinfo output
I am also attaching my hwinfo dump for completeness...
https://bugzilla.novell.com/show_bug.cgi?id=231205
lmb@novell.com changed:
           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|kernel-maintainers@forge.provo.novell.com |trenn@novell.com
             Status|ASSIGNED                                  |NEW
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |ASSIGNED
------- Comment #33 from trenn@novell.com 2007-02-28 03:04 MST -------
If this is a Pentium M: Does the boot parameter max_cstate=1 help? If yes, this should be fixed in the next update kernel, but I can point you to a kernel to test and verify before it comes out.
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
      Info Provider|                            |fkamogee@yahoo.com
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #34 from kai@kaishome.de 2007-02-28 04:46 MST -------
(In reply to comment #33)
> If this is a Pentium M: Does the boot parameter max_cstate=1 help? If yes, this should be fixed in next update kernel, but I can point you to kernel to test and verify before it's coming out.
For me it is an Asus P2B based dual Pentium-2 board...
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |charbs@operamail.com
------- Comment #35 from trenn@novell.com 2007-02-28 06:33 MST -------
*** Bug 229217 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=231205
------- Comment #36 from fkamogee@yahoo.com 2007-02-28 19:10 MST -------
For me it is a dual Pentium-III (Abit VP6)
https://bugzilla.novell.com/show_bug.cgi?id=231205
trenn@novell.com changed:
           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
      Info Provider|fkamogee@yahoo.com          |
         Resolution|                            |DUPLICATE
------- Comment #37 from trenn@novell.com 2007-03-05 11:06 MST -------
*** This bug has been marked as a duplicate of bug 232013 ***