[Bug 251109] New: Unreliable boot
https://bugzilla.novell.com/show_bug.cgi?id=251109

           Summary: Unreliable boot
           Product: openSUSE 10.2
           Version: Final
          Platform: x86-64
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: hpkeck@web.de
         QAContact: qa@suse.de

Since I installed 10.2 I have been having serious problems booting the system. About every second boot it crashes, meaning that it freezes completely, usually shortly after printing a kernel oops message or a segmentation fault. Sometimes it seems to boot normally, but then some processes mysteriously crash (such as kdm getting a segfault when I try to log in), which in most cases leads to a complete freeze sooner or later.

At first I played around with kernel parameters, but that didn't help. Now I have found that the setting of INITRD_MODULES seems to have some influence. The original setting was "processor thermal sata_via via82cxxx fan jbd ext3 edd". If I change that to "ide-generic jbd ext3", booting becomes more reliable (only about 20% of boots crash), but the system is then very slow, of course (no DMA). If I set INITRD_MODULES to either "processor thermal sata_via ide-generic fan jbd ext3 edd" or "via82cxxx jbd ext3", booting is as unreliable as it was in the beginning.

It seems to me that none of these modules causes the problem, and that it is more likely a memory-related bug somewhere else in the kernel, which just shows up more or less often.

My system is based on an Asus A8V mainboard (VIA chipset). I should mention that I ran Suse 9.3 on it for more than half a year, and also played around with Suse 10.0 and Ubuntu 6.06, but never had the massive problems I am having with 10.2. I should also mention that I have already checked the RAM with Memtest86, which didn't find any problems.
-- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=251109

------- Comment #1 from hpkeck@web.de 2007-03-04 01:20 MST -------
Created an attachment (id=122219)
 --> (https://bugzilla.novell.com/attachment.cgi?id=122219&action=view)
Output of lspci -v
https://bugzilla.novell.com/show_bug.cgi?id=251109

------- Comment #2 from hpkeck@web.de 2007-03-04 01:22 MST -------
Created an attachment (id=122220)
 --> (https://bugzilla.novell.com/attachment.cgi?id=122220&action=view)
Example of kernel oops

Most of the time, the oops (if there is one) looks similar to this. The list of linked-in modules and the process vary, but the EIP is usually at get_vmalloc_info with that call trace.
https://bugzilla.novell.com/show_bug.cgi?id=251109

markgray+to-suse@puck.nac.net changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |markgray+to-suse@puck.nac.net

------- Comment #3 from markgray+to-suse@puck.nac.net 2007-03-04 09:07 MST -------
I had almost exactly the same problem with my VIA chipset motherboard when using the 10.2 default kernel, kernel-default-2.6.18.2-34. I downloaded 2.6.19.1 from kernel.org and compiled it myself, and I have had absolutely no problems since. It may be a problem with the kernel version itself, or with the patches openSUSE applies to it. My suspect is the xen patch, but that is based solely on the fact that the xen patch caused problems during the install of 10.2 Alpha3 (see bug #202079). There is another 10.2 bug I have been following which sounds very similar, but I forget the number.
https://bugzilla.novell.com/show_bug.cgi?id=251109

------- Comment #4 from hpkeck@web.de 2007-03-06 12:37 MST -------
I tested the vanilla kernels 2.6.19.1 and 2.6.18.2. Neither shows the problem; booting is stable with both of them. Obviously the openSUSE patches break the kernel.
https://bugzilla.novell.com/show_bug.cgi?id=251109

jeffm@novell.com changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
         AssignedTo|kernel-maintainers@forge.provo.novell.com |npiggin@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=251109

hpkeck@web.de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID

------- Comment #6 from hpkeck@web.de 2007-04-05 10:41 MST -------
Things turned out to be different than I thought. More testing revealed that 2.6.19.1 wasn't completely stable either; the probability of failure was just lower. So I started experimenting and finally found out that I had a hardware problem. The main memory was flaky somehow, even though Memtest86 didn't report any problems. After lowering the memory clock speed, everything now runs stably.
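
The observation in comment #6, that a kernel can look stable when its failure probability is merely lower, can be illustrated with a little arithmetic. The following is only a rough Python sketch under simplifying assumptions: every boot is treated as independent with a fixed failure rate, and the function names are hypothetical. The 50% and 20% rates are the approximate figures from the original report; the 5% rate is a made-up value, since the thread does not give a failure rate for 2.6.19.1.

```python
import math


def prob_all_boots_clean(failure_rate: float, boots: int) -> float:
    """Probability that `boots` independent boots all succeed."""
    if not 0.0 <= failure_rate <= 1.0:
        raise ValueError("failure_rate must be between 0 and 1")
    if boots < 0:
        raise ValueError("boots must be non-negative")
    return (1.0 - failure_rate) ** boots


def boots_needed(failure_rate: float, confidence: float = 0.95) -> int:
    """Smallest number of test boots after which a fault with the given
    failure rate would have shown up at least once with probability
    >= confidence."""
    if not 0.0 < failure_rate < 1.0:
        raise ValueError("failure_rate must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Solve (1 - failure_rate) ** n <= 1 - confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - failure_rate))


if __name__ == "__main__":
    # 0.5 and 0.2 are the approximate rates from the report;
    # 0.05 is a hypothetical rate, not a figure from the thread.
    for rate in (0.5, 0.2, 0.05):
        print(f"failure rate {rate:.0%}: "
              f"{boots_needed(rate)} boots needed for 95% confidence, "
              f"chance of 5 clean boots in a row = "
              f"{prob_all_boots_clean(rate, 5):.1%}")
```

Under these assumptions, a handful of clean boots says little about a low failure rate: even at a 20% failure rate, five clean boots in a row happen about a third of the time by chance.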