[Bug 855384] New: Kernel Bug leads to X server crash
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c0

Summary: Kernel Bug leads to X server crash
Classification: openSUSE
Product: openSUSE 12.3
Version: Final
Platform: x86-64
OS/Version: Linux
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: mantel@suse.com
QAContact: qa-bugs@suse.de
Found By: Development
Blocker: ---

Created an attachment (id=571735)
--> (http://bugzilla.novell.com/attachment.cgi?id=571735)
Kernel bug message

I was just copying some data to a NAS via NFS, when suddenly the X server was killed and a kernel bug message was printed to the screen. Luckily the message has been logged. See attachment.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c1

Borislav Petkov <bpetkov@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC | |bpetkov@suse.com, jack@suse.com

--- Comment #1 from Borislav Petkov <bpetkov@suse.com> 2013-12-13 20:32:46 UTC ---
I can see a couple of reports on the net hitting that BUG_ON but no fixes. Maybe Jan would have a better idea.
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c2

--- Comment #2 from Jan Kara <jack@suse.com> 2013-12-14 22:13:19 UTC ---
So I've seen one very similar report and that was tracked down to faulty memory. The failure is on:

BUG_ON(!list_empty(&bh->b_assoc_buffers))

Now ext4 doesn't use this list at all, so it is indeed strange that it wouldn't be empty (it can still be some sw-induced memory corruption). Can you post the full dmesg or /var/log/messages from that day or something like that? Next week, I'll try to dig into the disassembly of our kernels and find out whether something useful is left in the registers etc.
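For readers unfamiliar with the check in comment #2, here is a minimal userspace sketch of what it asserts. The list helpers are stand-ins modeled on the kernel's circular doubly linked list; `struct bh_sketch`, `BUG_ON_SKETCH` and `free_bh_sketch` are hypothetical names introduced only for illustration and are not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* An empty list is a head whose pointers refer back to itself. */
static void init_list_head(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

/* Empty means the head's next pointer still points at the head. */
static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Link a new entry right after the head. */
static void list_add(struct list_head *entry, struct list_head *head)
{
    entry->next = head->next;
    entry->prev = head;
    head->next->prev = entry;
    head->next = entry;
}

/* Unlink an entry and leave it as a valid empty list. */
static void list_del_init(struct list_head *entry)
{
    entry->prev->next = entry->next;
    entry->next->prev = entry->prev;
    init_list_head(entry);
}

/* Hypothetical, stripped-down buffer head: only the field named in the BUG_ON. */
struct bh_sketch {
    struct list_head b_assoc_buffers;
};

/* Userspace analogue of BUG_ON(): abort when the condition holds. */
#define BUG_ON_SKETCH(cond) assert(!(cond))

/* Mirrors the failing check: the list must be empty at this point. */
static void free_bh_sketch(struct bh_sketch *bh)
{
    BUG_ON_SKETCH(!list_empty(&bh->b_assoc_buffers));
}
```

If nothing ever links `b_assoc_buffers` into another list, its `next` pointer should still point at itself and the check passes. That is why a non-empty list here suggests the stored pointer was changed by something other than the list code.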
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c3

Jan Kara <jack@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Priority |P5 - None |P3 - Medium
Status |NEW |NEEDINFO
InfoProvider | |mantel@suse.com
AssignedTo |kernel-maintainers@forge.provo.novell.com |jack@suse.com

--- Comment #3 from Jan Kara <jack@suse.com> 2013-12-14 22:13:53 UTC ---
Forgot to set needinfo.
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c4

--- Comment #4 from Hubert Mantel <mantel@suse.com> 2013-12-16 08:07:06 UTC ---
Created an attachment (id=571891)
--> (http://bugzilla.novell.com/attachment.cgi?id=571891)
dmesg up until the BUG
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c5

Hubert Mantel <mantel@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEEDINFO |NEW
InfoProvider |mantel@suse.com |

--- Comment #5 from Hubert Mantel <mantel@suse.com> 2013-12-16 08:08:28 UTC ---
dmesg from that day up until the BUG.

As for the faulty memory: I have been running a stress test for three days and nights with all processors computing and all RAM being used. Except for this incident, the machine runs stable.
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c6

Jan Kara <jack@suse.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status |NEW |RESOLVED
Resolution | |WORKSFORME

--- Comment #6 from Jan Kara <jack@suse.com> 2013-12-16 08:25:08 UTC ---
Hum, nothing really useful in the registers and the log looks pretty standard as well. We know the buffer head is at address 0xffff88027b9505a8 but we don't know what its contents are (the cmp instruction in the list_empty() check uses a direct memory reference, so the compared value never lands in a register). Unless this is reproducible, I'm afraid I cannot debug this further. It can be just a random bitflip or some memory corruption from a driver or similar stuff.

If this happens again, please reopen this bug and I'll provide you with a debug patch to dump more information. Alternatively, you could configure kdump on the machine.
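To illustrate why a single random bitflip would be enough to trip a list_empty() check, here is a small userspace sketch. The struct and `list_empty` are stand-ins modeled on the kernel's list API, and `flip_bit_in_next` is a hypothetical helper that simulates a faulty memory cell; none of this is taken from the attached logs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's struct list_head. */
struct list_head {
    struct list_head *next, *prev;
};

/* Empty means the head's next pointer still points at the head itself. */
static bool list_empty(const struct list_head *head)
{
    return head->next == head;
}

/*
 * Hypothetical helper: flip one bit of the stored next pointer, as a
 * flaky DRAM cell might. The resulting pointer is never dereferenced.
 */
static void flip_bit_in_next(struct list_head *head, unsigned bit)
{
    uintptr_t raw = (uintptr_t)head->next;

    raw ^= (uintptr_t)1 << bit;
    head->next = (struct list_head *)raw;
}
```

After one flip, `head->next` no longer equals `head`, so the list looks non-empty even though no list operation ever ran. Because the comparison reads the value straight from memory, the register dump alone cannot show which bit differed.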
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c7

--- Comment #7 from Hubert Mantel <mantel@suse.com> 2014-01-09 07:01:47 UTC ---
Just wanted to inform you that I can confirm this is a faulty memory issue. Will still need some days to identify the flaky module...
https://bugzilla.novell.com/show_bug.cgi?id=855384

https://bugzilla.novell.com/show_bug.cgi?id=855384#c8

--- Comment #8 from Jan Kara <jack@suse.com> 2014-01-10 06:09:29 UTC ---
Glad to hear that. Thanks for the info.