[Bug 799475] New: kernel bug in …/mm/slab.c:3175 and subsequent deep freeze
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c0

Summary: kernel bug in …/mm/slab.c:3175 and subsequent deep freeze
Classification: openSUSE
Product: openSUSE 12.2
Version: Final
Platform: x86-64
OS/Version: openSUSE 12.2
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: avsco@mail.ru
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:18.0) Gecko/20100101 Firefox/18.0

Preface. I use openSUSE occasionally, not on a day-to-day basis. To make things worse, my desktop has Intel ICH10R fake-RAID, so I'm not usually surprised when the mirror drops to the “verify” state after a graceful openSUSE shutdown or after pressing the Reset button during a stalled shutdown, nor when openSUSE turns into a pumpkin several months after installation and refuses to start until I reinstall it.

This time it looked the same: [after applying recent updates about a month ago, I guess] openSUSE stopped loading and fell back to single-user mode, perhaps after a failed /home mount operation; it's hard to tell from the intermixed systemd output. But today I looked more carefully at the previous messages in the scrollback buffer and noticed that there was actually a crash report up there. As this report was not written to any file (at least I didn't find any), I wrote down the first lines:
kernel bug in …/mm/slab.c:3175
invalid opcode: 0000
pid = mount, sig = SEGV
Trace:
...
__kmalloc+0x153/0x190
...
ext4_kvzalloc+0x1d/0x60 (numbers in this line are the same as in later reports)
...
ext4_fill_super+0x1556/0x2840
.....
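For readers unfamiliar with the trace notation: each entry has the form symbol+offset/size, where both numbers are hexadecimal, the first being the offset into the function and the second the function's total length. The following is only a minimal sketch of pulling those fields out of hand-copied trace text; the helper name parse_frames and the regular expression are illustrative assumptions, not part of any kernel tooling.

```python
import re

# Matches entries such as "__kmalloc+0x153/0x190":
# symbol name, hex offset into the function, hex size of the function.
FRAME_RE = re.compile(
    r"(?P<symbol>[A-Za-z_][\w.]*)\+0x(?P<offset>[0-9a-fA-F]+)/0x(?P<size>[0-9a-fA-F]+)"
)

def parse_frames(text):
    """Return (symbol, offset, size) tuples for every frame found in text.

    Hypothetical helper for illustration only.
    """
    return [
        (m.group("symbol"), int(m.group("offset"), 16), int(m.group("size"), 16))
        for m in FRAME_RE.finditer(text)
    ]

# The three frames written down in the report above.
trace = "__kmalloc+0x153/0x190 ext4_kvzalloc+0x1d/0x60 ext4_fill_super+0x1556/0x2840"
frames = parse_frames(trace)
for symbol, offset, size in frames:
    print(f"{symbol}: offset {offset:#x} within a function of {size:#x} bytes")
```

Comparing the parsed tuples from two boots is one way to check the claim that the numbers are the same between reports.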
Then it turned out that booting in failsafe mode let me enter KDE without a problem. Having found no information on that specific bug, I decided to apply the current updates as well. Although no kernel updates were available at the time, the situation worsened. At first, half of the attempts to boot in normal mode produced 3 crash reports ending in a deep freeze; the other half still dropped to single-user mode after 1 crash report. But after several tries the 3-crash path prevailed completely, so I now have no option but to photograph the screen (see the attachments). It now proceeds as follows:

1. The boot process starts as usual, mounting some partitions and starting some services.
2. The original …/mm/slab.c:3175 crash occurs (at least I suppose so, because the call traces look similar) and is immediately followed by another one, as depicted in trace01.png.
3. The computer freezes in a way that looks like 100 % load on a processor core: generally unresponsive, but it may occasionally react to scrollback keys.
4. After 45 seconds, the third crash report is displayed, as depicted in trace02.png, and the computer finally hangs in what looks like an idle halt.

The resulting freeze is so deep that SysRq keys stop working and I have to press the Reset button. The failsafe mode still worked, though, so I finally figured out which options do the trick:
nohz=off highres=off

When both are present, the system boots to KDE just fine, as it always did earlier. When either of these options is missing, the 3-crash issue is sure to occur.
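To confirm which of the two options a given boot actually received, the kernel command line can be inspected (on a running Linux system it is exposed in /proc/cmdline). What follows is a rough sketch under stated assumptions: the function names and the example command line are hypothetical, and the simple whitespace split does not handle quoted values containing spaces.

```python
# The two settings the report found necessary (taken from the text above).
REQUIRED = {"nohz": "off", "highres": "off"}

def parse_cmdline(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' (e.g. "nomodeset") map to None.
    Hypothetical helper; quoted values with spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        name, sep, value = token.partition("=")
        params[name] = value if sep else None
    return params

def missing_workarounds(cmdline):
    """Return the required 'name=value' settings absent from cmdline."""
    params = parse_cmdline(cmdline)
    return sorted(f"{k}={v}" for k, v in REQUIRED.items() if params.get(k) != v)

def read_running_cmdline(path="/proc/cmdline"):
    """Read the command line of the currently running kernel (Linux only)."""
    with open(path) as f:
        return f.read().strip()

# Made-up command line for illustration; the root= value is an assumption.
example = "root=/dev/mapper/system-root nohz=off quiet"
print(missing_workarounds(example))  # ['highres=off']
```

On the affected machine, missing_workarounds(read_running_cmdline()) would return an empty list for a boot that carried both options.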
Further details. All filesystems, including the mentioned /home, are ext4 residing on LVM2 volumes, with the exception of /boot, which is on a raw partition. I have made no changes to the disks in recent months or years, nor to other hardware (except the video card, but I doubt it could influence a filesystem driver or timer module).

Any hints on how to debug this?

Reproducible: Always

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c1
--- Comment #1 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c2
--- Comment #2 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c3
--- Comment #3 from Anton Samsonov
apm=off edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 nomodeset

or something like this; I don't remember exactly which one of “edd” or “nomodeset” was excluded.

At this point I would start to consider the possibility of a real hardware failure if openSUSE were the only operating system on this computer. But my primary OS is Windows 7, and it boots just fine, [almost] never breaks the fake-RAID on shutdown, and runs demanding modern video games as well as CPU-, GPU- and RAM-intensive BOINC computations that are cross-validated against other nodes. Of course, this is not strong proof, but I'm more inclined towards a software bug, given that the situation worsened each time after updating openSUSE.
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c4
--- Comment #4 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c5
--- Comment #5 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c6
--- Comment #6 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c7
--- Comment #7 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c8
--- Comment #8 from Anton Samsonov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c9
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c10
--- Comment #10 from Anton Samsonov
> Has your experience with 12.3 been any better?
The experience is always more or less the same: for several months after installation everything works just fine, but eventually the system deteriorates to an unusable state in which only a single-user prompt is available, and that doesn't help because the system either crashes or hangs on transition to higher runlevels. By the time this happens, a new version of openSUSE is usually available, so, after several attempts to fix the problem, I install the new version from scratch. This gives me another few months, and the cycle repeats.

If/when the same happens to openSUSE 12.3, I'll try to report back, but, again, I have absolutely no idea how to provide more helpful dumps for such cases.
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c11
Borislav Petkov
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c12
--- Comment #12 from Anton Samsonov
> Is this issue still of interest or can we close?
I've updated to 12.3 and 13.1 since then, with relatively less usage than before, so problems (if they persist) haven't had much time to accumulate. Thus it may be better to close this entry and perhaps open a new one if/when necessary, against a recent openSUSE version.
https://bugzilla.novell.com/show_bug.cgi?id=799475
https://bugzilla.novell.com/show_bug.cgi?id=799475#c13
Jeff Mahoney