[Bug 660464] New: complete system freeze regression
https://bugzilla.novell.com/show_bug.cgi?id=660464#c0

           Summary: complete system freeze regression
    Classification: openSUSE
           Product: openSUSE 11.4
           Version: Factory
          Platform: x86
        OS/Version: Linux
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Basesystem
        AssignedTo: kasievers@novell.com
        ReportedBy: bwiedemann@novell.com
         QAContact: qa@suse.de
          Found By: System Test
           Blocker: ---

openQA testing has shown a complete system freeze early in booting, or sometimes after install, in 32-bit installs.

http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0963
http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0964
http://openqa.opensuse.org/results/openSUSE-NET-i586-Build0964-lxde

How To Reproduce:
1. qemu-kvm -m 1000 -cdrom factory/iso/openSUSE-NET-i586-Build0964-Media.iso
2. (maybe optional) on the boot prompt add nohz=off
3. optionally use F3 to select text mode to see console messages
4. press return to boot

Actual Results:
Boot will often stop after printing
"openSUSE installation program v3.5.7...
<<< Starting udev..."

Expected Results:
should work like yesterday's version

Reproducible: Sometimes

- sometimes x86_64 versions also showed this problem.
- also happens in VirtualBox
- the statuser values in the test log show that it is busy-looping

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
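The reproduction steps in the report above could be scripted roughly as follows. This is only a sketch: the helper names qemu_argv and looks_hung are hypothetical, and looks_hung assumes the console output has already been captured as text (for example via a serial console, as done later in this thread with console=ttyS0). It checks only for the "Starting udev" message quoted in the report and does not detect other kinds of hang.

```python
import shlex

# ISO path exactly as given in step 1 of the report.
ISO = "factory/iso/openSUSE-NET-i586-Build0964-Media.iso"


def qemu_argv(iso=ISO, mem_mb=1000):
    """Return the qemu-kvm command from step 1 as an argv list."""
    return ["qemu-kvm", "-m", str(mem_mb), "-cdrom", iso]


def looks_hung(console_text):
    """Return True if the last non-empty console line is the udev start message.

    Heuristic only: it matches the symptom quoted in the report, where boot
    stops right after "<<< Starting udev...".
    """
    lines = [ln.strip() for ln in console_text.splitlines() if ln.strip()]
    return bool(lines) and "Starting udev" in lines[-1]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(qemu_argv()))
```

The command is only printed here; actually launching it (for instance with subprocess.run) and capturing the console is left out, because how the console is captured depends on the test setup.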
https://bugzilla.novell.com/show_bug.cgi?id=660464#c1

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Basesystem                  |Kernel
         AssignedTo|kasievers@novell.com        |kernel-maintainers@forge.provo.novell.com

--- Comment #1 from Bernhard Wiedemann <bwiedemann@novell.com> 2010-12-21 06:52:08 CET ---
Now I have seen a kernel panic on
http://openqa.opensuse.org/opensuse/permanent/bug/bug660464-2.jpg

So maybe it is actually a kernel problem that only started to be randomly triggered by something else later?
https://bugzilla.novell.com/show_bug.cgi?id=660464#c2

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P2 - High

--- Comment #2 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-04 12:28:34 CET ---
http://www.linuxquestions.org/questions/slackware-14/current-randomly-timed-...
discusses the very same bug.
It appears to be a bug in the kernel's SCSI passthrough, triggered by udev-165 using an additional SCSI command.

Tests with today's openSUSE-GNOME-LiveCD-i686-Build0988-Media.iso on KVM had it failing in 15 of 20 tries. nohz=off is not required for that.
https://bugzilla.novell.com/show_bug.cgi?id=660464#c

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
             Status|NEW                                       |ASSIGNED
         AssignedTo|kernel-maintainers@forge.provo.novell.com |jeffm@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=660464#c3

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |bwiedemann@novell.com

--- Comment #3 from Jeff Mahoney <jeffm@novell.com> 2011-01-06 19:55:03 UTC ---
Can you re-capture the oops but boot with panic_on_oops=1 so we can see the primary oops?
https://bugzilla.novell.com/show_bug.cgi?id=660464#c4

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
       InfoProvider|bwiedemann@novell.com       |

--- Comment #4 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-07 09:34:24 CET ---
Created an attachment (id=407345)
 --> (http://bugzilla.novell.com/attachment.cgi?id=407345)
serial console log with Oops+backtrace

used console=ttyS0 instead
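When a serial console log holds several oopses, the primary one requested in comment #3 is the first that the kernel reports. A minimal sketch for pulling it out of captured log text, assuming the usual "Oops: <code> [#N]" header line of x86 kernels; the function name primary_oops and the before/after context sizes are hypothetical choices, not something used in this bug:

```python
import re

# Matches the oops header line, e.g. "Oops: 0000 [#1] SMP"; group 1 is the
# oops counter, which is 1 for the first (primary) oops since boot.
OOPS_RE = re.compile(r"Oops: [0-9a-fA-F]+ \[#(\d+)\]")


def primary_oops(log_text, before=1, after=30):
    """Return the lines around the first oops (counter #1), or [] if none.

    `before` lines of leading context are kept because the "BUG: ..." line
    describing the fault is normally printed just above the header.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        match = OOPS_RE.search(line)
        if match and match.group(1) == "1":
            return lines[max(0, i - before):i + after + 1]
    return []
```

Calling primary_oops(open("serial.log").read()) on a captured log (the file name is a placeholder) would return the header plus its surrounding context. Follow-up oopses with higher counters are not filtered out if they fall within that context window, so a small `after` value may be preferable for noisy logs.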
https://bugzilla.novell.com/show_bug.cgi?id=660464#c5

--- Comment #5 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-08 23:10:47 CET ---
I had a similar panic on my laptop (Amilo Pro 2010) with 2.6.37-rc7, but that went away when using 2.6.37 from Kernel:/HEAD, so there might already be a fix.
https://bugzilla.novell.com/show_bug.cgi?id=660464#c6

--- Comment #6 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-11 15:50:02 CET ---
Created an attachment (id=407787)
 --> (http://bugzilla.novell.com/attachment.cgi?id=407787)
serial console log with Oops+backtrace from 2.6.37-default

log has one successful boot and one oops after reset, so on KVM, bug might still be there with final 2.6.37
https://bugzilla.novell.com/show_bug.cgi?id=660464#c

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
               Flag|                            |SHIP_STOPPER+
https://bugzilla.novell.com/show_bug.cgi?id=660464#c7

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mgalbraith@novell.com

--- Comment #7 from Stephan Kulow <coolo@novell.com> 2011-01-19 13:41:06 CET ---
According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can reproduce it too
https://bugzilla.novell.com/show_bug.cgi?id=660464#c8

--- Comment #8 from Stephan Kulow <coolo@novell.com> 2011-01-19 14:53:28 CET ---
Tejun has a working patch: http://marc.info/?l=linux-hotplug&m=129536338129945&w=2
https://bugzilla.novell.com/show_bug.cgi?id=660464#c9

--- Comment #9 from Mike Galbraith <mgalbraith@novell.com> 2011-01-19 14:33:35 UTC ---
(In reply to comment #7)
> According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can reproduce it too

The crashes I could reproduce were cured by

patches.fixes/sched-cgroup-use-exit-hook-to-avoid-use-after-free-crash

which is the patch in this thread, with another hunk to prevent the exit hook from messing with a failed fork child on its way to the grave, and thereby making autogroup diddle freed memory.
https://bugzilla.novell.com/show_bug.cgi?id=660464#c10

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |coolo@novell.com

--- Comment #10 from Stephan Kulow <coolo@novell.com> 2011-01-20 10:15:12 CET ---
ok, so the other bug is fixed by #8 - if someone could push it to master asap I would be grateful
https://bugzilla.novell.com/show_bug.cgi?id=660464#c11

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |NEEDINFO
       InfoProvider|                            |bwiedemann@novell.com

--- Comment #11 from Jeff Mahoney <jeffm@novell.com> 2011-01-21 23:43:40 UTC ---
(In reply to comment #7)
> According to http://marc.info/?l=kernel-janitors&m=129378990812615&w=1 Mike can reproduce it too

No, according to that thread, Mike could produce /an/ Oops. Not /this/ Oops.

I've applied the patch from comment #8 to the repo and have forced an update to Kernel:HEAD for testing. Please try a kernel with the following changelog entry and report back.

ata: Fix panics with ata_id (bnc#660464).
https://bugzilla.novell.com/show_bug.cgi?id=660464#c12

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |ASSIGNED
       InfoProvider|bwiedemann@novell.com       |

--- Comment #12 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-25 18:46:08 CET ---
No more i586 crashes on openQA in over 20 test runs since this went into Factory.
Cannot yet tell about i686 LiveCDs, since none have been built so far.
But it looks good.
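As a rough plausibility check on the 20 clean runs reported above: if the bug were still present at the rate seen in comment #2 (15 failures in 20 tries), a streak of 20 passes would be extremely unlikely. The sketch below assumes independent runs and assumes the LiveCD failure rate also applies to the i586 installs, which the thread does not establish; the function name is hypothetical.

```python
from fractions import Fraction


def chance_of_clean_streak(failures, tries, streak):
    """Probability of `streak` consecutive passes, given an observed
    failure rate of failures/tries and independent runs (an assumption)."""
    p_pass = 1 - Fraction(failures, tries)
    return p_pass ** streak


if __name__ == "__main__":
    p = chance_of_clean_streak(15, 20, 20)
    print(p)         # exact value: 1/1099511627776
    print(float(p))  # roughly 9.1e-13
```

Under those assumptions, 20 passes in a row by chance would have a probability of (1/4)^20, so the clean streak is strong evidence that the patch removed this particular crash.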
https://bugzilla.novell.com/show_bug.cgi?id=660464#c13

Jeff Mahoney <jeffm@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|ASSIGNED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #13 from Jeff Mahoney <jeffm@novell.com> 2011-01-25 17:53:41 UTC ---
Thanks. I'll close as FIXED. Please re-open if the LiveCDs fail.
https://bugzilla.novell.com/show_bug.cgi?id=660464#c14

Bernhard Wiedemann <bwiedemann@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |VERIFIED

--- Comment #14 from Bernhard Wiedemann <bwiedemann@novell.com> 2011-01-29 19:08:35 CET ---
Bug has not been seen again. Not even on LiveCDs.
https://bugzilla.novell.com/show_bug.cgi?id=660464#c15

--- Comment #15 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-07 23:43:25 UTC ---
Created an attachment (id=412674)
 --> (http://bugzilla.novell.com/attachment.cgi?id=412674)
Default kernel log

After updating from 11.3 to 11.4-M6 (x86_64), my system (laptop HP Compaq 6720s) freezes completely on boot. It happens almost always (roughly 9 times out of 10). On the console I see only "Creating device nodes with udev" and nothing more. I also saw this problem in 11.3 with newer kernels (2.6.36, 2.6.37).
https://bugzilla.novell.com/show_bug.cgi?id=660464#c16

--- Comment #16 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-07 23:45:45 UTC ---
Created an attachment (id=412675)
 --> (http://bugzilla.novell.com/attachment.cgi?id=412675)
"Failsafe" parameters (apm=off noresume edd=off powersaved=off nohz=off highres=off processor.max_cstate=1 x11failsafe vga=0x317)
https://bugzilla.novell.com/show_bug.cgi?id=660464#c17

--- Comment #17 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-07 23:47:13 UTC ---
Created an attachment (id=412676)
 --> (http://bugzilla.novell.com/attachment.cgi?id=412676)
Default kernel log + nomodeset - flood in logs by udev
https://bugzilla.novell.com/show_bug.cgi?id=660464#c18

--- Comment #18 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-07 23:49:09 UTC ---
Created an attachment (id=412677)
 --> (http://bugzilla.novell.com/attachment.cgi?id=412677)
System log after successful boot (udev's flood again)
https://bugzilla.novell.com/show_bug.cgi?id=660464#c19

Vadim Kotelnikov <vadimuzzz@inbox.ru> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|VERIFIED                    |REOPENED
                 CC|                            |vadimuzzz@inbox.ru
         Resolution|FIXED                       |

--- Comment #19 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-07 23:51:35 UTC ---
Bug is here (see above).
https://bugzilla.novell.com/show_bug.cgi?id=660464#c20

--- Comment #20 from Vadim Kotelnikov <vadimuzzz@inbox.ru> 2011-02-13 01:24:51 UTC ---
11.4 RC1 - bug is still here
https://bugzilla.novell.com/show_bug.cgi?id=660464#c21

Stephan Kulow <coolo@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|REOPENED                    |RESOLVED
         Resolution|                            |FIXED

--- Comment #21 from Stephan Kulow <coolo@novell.com> 2011-02-15 09:32:17 CET ---
sorry, this is a different bug, so please track it under a different number. Your problem is hardware specific - or it wouldn't go away with kernel parameters.