[Bug 666423] New: hypervisor/Dom0 crashes under high load on OBS workers
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c0

Summary: hypervisor/Dom0 crashes under high load on OBS workers
Classification: openSUSE
Product: openSUSE 11.4
Version: Factory
Platform: Other
OS/Version: Other
Status: NEW
Severity: Major
Priority: P5 - None
Component: Xen
AssignedTo: jbeulich@novell.com
ReportedBy: adrian@novell.com
QAContact: qa@suse.de
CC: jdouglas@novell.com
Found By: ---
Blocker: ---

Some build hosts (build3x), which ran stably with openSUSE 11.1, crash within hours. The local directory ~adrian/4jan/ contains two log files with hypervisor traces and also all files from the /boot/ directory of such a build host. Please tell me if you need any other information.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c1
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c2
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c3
--- Comment #3 from Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c4
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c5
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c6
--- Comment #6 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c7
--- Comment #7 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c8
--- Comment #8 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c9
--- Comment #9 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c10
--- Comment #10 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c11
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c12
--- Comment #12 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c13
--- Comment #13 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c14
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c15
--- Comment #15 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c16
--- Comment #16 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c17
--- Comment #17 from Jeff Mahoney
> The conflict comes from the drop of kernel-xen in Kernel:HEAD project (together with .38 update). But this is still defined as the Factory devel project, I don't think you still want to update to .38 for suse 11.4?
> I can use the old sources from factory instead as base ....
The Kernel:openSUSE-11.4 project should be used instead. I've just filed a devel project change request to adjust that until 11.4 is released. SR 60607
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c18
--- Comment #18 from Jan Beulich
> loglevel options (as asked by Jan) are btw:
> loglvl=all guest_loglvl=all vga=text-80x50,keep loglvl=debug debug=y
"loglvl=all guest_loglvl=all vga=text-80x50,keep" makes sense; I don't think I ever asked for the other two. Furthermore, these affect only the hypervisor's verbosity; the kernel's is controlled elsewhere (and can be overridden by adding "ignore_loglevel" to the kernel [not Xen] command line).
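To illustrate the distinction drawn above between the Xen command line and the kernel command line, here is a minimal sketch that assembles a GRUB legacy (menu.lst) stanza. It assumes a GRUB legacy setup in which the hypervisor is loaded on the "kernel" line and the Dom0 kernel as the first "module". The image paths, the root device, and the function name are hypothetical; only the option names come from the thread.

```python
# Sketch only: image paths and root device below are hypothetical examples.
# The option names are the ones discussed in the comment above.
XEN_OPTIONS = ["loglvl=all", "guest_loglvl=all", "vga=text-80x50,keep"]
KERNEL_OPTIONS = ["ignore_loglevel"]


def grub_legacy_entry(xen_image, kernel_image, initrd_image, root_device,
                      xen_options=XEN_OPTIONS, kernel_options=KERNEL_OPTIONS):
    """Return a GRUB legacy stanza that boots Xen with a Dom0 kernel."""
    lines = [
        "title Xen (verbose logging)",
        # The hypervisor is the multiboot "kernel"; hypervisor options go here.
        "    kernel " + " ".join([xen_image] + list(xen_options)),
        # The Dom0 Linux kernel is the first module; kernel options go here.
        "    module " + " ".join(
            [kernel_image, "root=" + root_device] + list(kernel_options)),
        "    module " + initrd_image,
    ]
    return "\n".join(lines)


entry = grub_legacy_entry("/boot/xen.gz", "/boot/vmlinuz-xen",
                          "/boot/initrd-xen", "/dev/sda1")
print(entry)
```

The point of the sketch is only the placement: loglvl and guest_loglvl sit on the hypervisor line, while ignore_loglevel sits on the line that loads the Dom0 kernel.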
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c19
--- Comment #19 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c20
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c21
--- Comment #21 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c22
--- Comment #22 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c23
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c24
--- Comment #24 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c25
--- Comment #25 from Stephan Kulow
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c26
--- Comment #26 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c27
Stephan Kulow
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c28
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c29
--- Comment #29 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c30
--- Comment #30 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c31
--- Comment #31 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c32
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c33
--- Comment #33 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c34
--- Comment #34 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c35
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c36
--- Comment #36 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c37
Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c38
--- Comment #38 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c39
--- Comment #39 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c40
--- Comment #40 from Miklos Szeredi
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c41
--- Comment #41 from Jan Kara
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c42
Leonardo Chiquitto
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c43
Jiri Slaby
> Master branch will get it from upstream.
I pushed it to stable. But I think you should push it to master too. In case it gets lost again, we will be reminded later. And also there is no reason to have factory users affected by this bug. Or not?
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c44
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c45
--- Comment #45 from Jan Kara
(In reply to comment #41)
>> Master branch will get it from upstream.
> I pushed it to stable. But I think you should push it to master too. In case it gets lost again, we will be reminded later. And also there is no reason to have factory users affected by this bug. Or not?
Well, it was not really easy to trigger, so I thought I'd save myself 3 minutes of work ;). Pushed it to the master branch now as well.
BTW, I also track this patch in my git tree, so I will be reminded if it is not merged...
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c46
Jan Kara
> The new kernel may improve stability a bit (just a feeling). However, I still have a number of frozen systems after running it for a week. Frozen means that I am not able to get any trace via sysrq. I still have no idea how to debug this. (This is most likely not a filesystem issue, right?)
A frozen system without the possibility of sysrq does not sound like a fs issue, since it indicates that interrupts are probably disabled. It could still be some AIO issue, since AIO does complex work in the I/O completion path, part of which runs in interrupt context. But unless we get at least some hint about where the system freezes, it's just wild guessing. It could as well be a HW issue. Any chance of triggering an NMI-induced kdump?
I guess file a separate bug so that it does not mix with this one, which is hopefully resolved. I'll close this bug now; please reopen if you get some trace pointing to a problem in end_writeback() again. Thanks.
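As a rough aid for the NMI-induced kdump question above, the following sketch checks two preconditions on a plain Linux system: whether a crash kernel is loaded (/sys/kernel/kexec_crash_loaded) and whether an unknown NMI is configured to panic (/proc/sys/kernel/unknown_nmi_panic). It is an assumption that these two files are the relevant ones here; the thread does not say how the NMI would be raised, and whether they apply unchanged to a Xen Dom0 is not established by the thread. The function names and the "root" parameter are hypothetical and exist only to make the sketch testable.

```python
from pathlib import Path


def read_flag(path):
    """Return the stripped contents of a small status file, or None if unreadable."""
    try:
        return Path(path).read_text().strip()
    except OSError:
        return None


def nmi_kdump_readiness(root="/"):
    """Report whether a crash kernel is loaded and whether an unknown NMI panics.

    Assumption: these two files are the relevant knobs on a bare-metal kernel.
    """
    root = Path(root)
    crash_loaded = read_flag(root / "sys/kernel/kexec_crash_loaded")
    nmi_panic = read_flag(root / "proc/sys/kernel/unknown_nmi_panic")
    return {
        "crash_kernel_loaded": crash_loaded == "1",
        "unknown_nmi_panics": nmi_panic == "1",
    }


if __name__ == "__main__":
    for name, ready in nmi_kdump_readiness().items():
        print(name, "yes" if ready else "no")
```

If both values come back true, an NMI from a hardware button or management controller should lead to a panic and then to a crash dump; if either is false, the NMI alone is unlikely to produce a usable dump.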
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c47
Marcus Meissner
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c48
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c49
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c50
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c51
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c52
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c53
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c54
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c55
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c56
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c57
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c58
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=666423
https://bugzilla.novell.com/show_bug.cgi?id=666423#c59
Swamp Workflow Management