[Bug 662317] New: DomU crashes under high load on OBS workers
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c0

           Summary: DomU crashes under high load on OBS workers
    Classification: openSUSE
           Product: openSUSE 11.4
           Version: Factory
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Xen
        AssignedTo: jdouglas@novell.com
        ReportedBy: adrian@novell.com
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b8) Gecko/20101214 Firefox/4.0b8

We were about to switch the OBS build hosts, using the current kernel from Factory (rc7).

We sometimes observe build errors, with the following error lines appearing at random times:

[15180.704185] Stack:
[15180.704198] Call Trace:
[15180.704263] Code: 00 e8 c3 f4 40 00 0f ae f0 4c 89 f7 e8 e8 da f9 ff 80 7c 24 0f 00 0f 84 d4 fe ff ff f6 43 20 01 0f 84 ca fe ff ff 0f 1f 00 f3 90 <f6> 43 20 01 75 f8 e9 ba fe ff ff 0f 1f 00 4c 89 e2 4c 89 ee 89
[15264.704034] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
[15264.704119] Stack:
[15264.704131] Call Trace:
[15264.704204] Code: 40 00 0f ae f0 4c 89 f7 e8 e8 da f9 ff 80 7c 24 0f 00 0f 84 d4 fe ff ff f6 43 20 01 0f 84 ca fe ff ff 0f 1f 00 f3 90 f6 43 20 01 <75> f8 e9 ba fe ff ff 0f 1f 00 4c 89 e2 4c 89 ee 89 ef e8 e3 cd
[15348.704028] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
[15348.704107] Stack:
[15348.704119] Call Trace:
[15348.704188] Code: 00 e8 c3 f4 40 00 0f ae f0 4c 89 f7 e8 e8 da f9 ff 80 7c 24 0f 00 0f 84 d4 fe ff ff f6 43 20 01 0f 84 ca fe ff ff 0f 1f 00 f3 90 <f6> 43 20 01 75 f8 e9 ba fe ff ff 0f 1f 00 4c 89 e2 4c 89 ee 89
[15432.704111] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
[15432.704195] Stack:
[15432.704208] Call Trace:
[15432.704280] Code: 00 e8 c3 f4 40 00 0f ae f0 4c 89 f7 e8 e8 da f9 ff 80 7c 24 0f 00 0f 84 d4 fe ff ff f6 43 20 01 0f 84 ca fe ff ff 0f 1f 00 f3 90 <f6> 43 20 01 75 f8 e9 ba fe ff ff 0f 1f 00 4c 89 e2 4c 89 ee 89

I have no way to trigger it reproducibly yet, but it seems to happen roughly every 15 minutes on some host.

What other information can I provide?

Reproducible: Always

Steps to Reproduce:
1.
2.
3.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
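When collecting more of these reports from several workers, it may help to pull the soft lockup lines out of the guest log mechanically. The following is a minimal sketch, not part of the original report: the function names `parse_soft_lockups` and `intervals` are hypothetical, and the pattern assumes the message format shown above ("BUG: soft lockup - CPU#N stuck for Ns! [task]").

```python
import re

# Matches lines such as:
# [15264.704034] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
SOFT_LOCKUP = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+BUG: soft lockup - CPU#(?P<cpu>\d+)\s+"
    r"stuck for (?P<secs>\d+)s!\s+\[(?P<task>[^\]]+)\]"
)

# The three soft lockup lines quoted in the report.
SAMPLE = """\
[15264.704034] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
[15348.704028] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
[15432.704111] BUG: soft lockup - CPU#5 stuck for 67s! [kworker/5:1:178]
"""


def parse_soft_lockups(text):
    """Return one dict per soft lockup message found in a kernel log."""
    events = []
    for m in SOFT_LOCKUP.finditer(text):
        events.append({
            "timestamp": float(m["ts"]),      # seconds since guest boot
            "cpu": int(m["cpu"]),             # CPU number reported as stuck
            "stuck_seconds": int(m["secs"]),  # duration reported by the watchdog
            "task": m["task"],                # task name as printed, e.g. kworker/5:1:178
        })
    return events


def intervals(events):
    """Seconds between consecutive soft lockup reports, rounded to milliseconds."""
    stamps = [e["timestamp"] for e in events]
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    found = parse_soft_lockups(SAMPLE)
    for event in found:
        print(event)
    print(intervals(found))
```

Run against the three messages quoted in the report, this shows the same CPU and the same kworker task each time, with the reports roughly 84 seconds apart.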
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c1
--- Comment #1 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c2
--- Comment #2 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c3
--- Comment #3 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c4
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c5
--- Comment #5 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c6
--- Comment #6 from Greg Kroah-Hartman
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c7
--- Comment #7 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c8
--- Comment #8 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c9
Jan Beulich
I have no way to trigger it reproducibly yet, but it seems to happen roughly every 15 minutes on some host. What other information can I provide?
SysRq-t output from affected guest(s) (assuming these are pv guests, "xm sysrq" should be usable for this as long as the guest does not hang completely).

If the guest hangs completely or can be observed to spin on one or more CPUs (observable through "xm vcpu-list"), xenctx run against the corresponding vCPU-s would likely provide further insight.

If the guest hangs/spins inside Xen (which is unlikely, as you didn't say the whole box got locked), 'd' sent from the serial console would be the most likely thing to provide further information.

As to there not being unusual output from "xm dmesg" - are you running with "loglvl=all guest_loglvl=all" on the Xen command line?

In any case, the most helpful thing of course would be to have a way to reproduce this.
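The data collection requested above could be scripted on the Dom0 roughly as follows. This is only a sketch under assumptions: the domain name `obs-worker-1`, the domain ID `7`, and the helper name `diagnostic_commands` are hypothetical, and the `xenctx <domid> <vcpu>` argument order is assumed rather than taken from this thread. The function only builds the command lines and does not execute anything.

```python
def diagnostic_commands(domain, domid, spinning_vcpus=()):
    """Build the Dom0 commands for the diagnostics requested in the comment.

    domain         -- guest name as shown by "xm list" (hypothetical value below)
    domid          -- numeric domain ID of that guest (hypothetical value below)
    spinning_vcpus -- vCPU numbers seen spinning in "xm vcpu-list" output
    """
    commands = [
        # SysRq-t: ask the pv guest to dump its task states to its own log.
        ["xm", "sysrq", domain, "t"],
        # Show which vCPUs are running and where they are pinned.
        ["xm", "vcpu-list", domain],
    ]
    for vcpu in spinning_vcpus:
        # Register and stack context of one vCPU.
        # Assumption: xenctx takes the domain ID followed by the vCPU number.
        commands.append(["xenctx", str(domid), str(vcpu)])
    # Hypervisor log; most useful when Xen was booted with
    # "loglvl=all guest_loglvl=all" on its command line.
    commands.append(["xm", "dmesg"])
    return commands


if __name__ == "__main__":
    for argv in diagnostic_commands("obs-worker-1", 7, spinning_vcpus=[5]):
        print(" ".join(argv))
```

Printing the commands first, instead of running them directly, keeps the sketch safe to try on a machine without Xen; on a real host the lists could be passed to `subprocess.run` one at a time.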
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c10
--- Comment #10 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c11
--- Comment #11 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c12
--- Comment #12 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c13
--- Comment #13 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c14
--- Comment #14 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c16
--- Comment #16 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c17
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c18
--- Comment #18 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c19
--- Comment #19 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c20
--- Comment #20 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c21
--- Comment #21 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c22
--- Comment #22 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c23
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c24
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c25
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c26
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c27
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c28
--- Comment #28 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c29
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c30
--- Comment #30 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c31
--- Comment #31 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c32
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c33
--- Comment #33 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c34
--- Comment #34 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c35
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c36
--- Comment #36 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c37
--- Comment #37 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c38
--- Comment #38 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c39
--- Comment #39 from Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c
Ihno Krumreich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c40
--- Comment #40 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c41
Adrian Schröter
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c42
Jan Beulich
Our build1x systems are running Xen with the kernel from openSUSE:Tools:Unstable and they are in general relatively stable, but they still freeze from time to time (2 out of 10 systems once a week). I get no trace or reaction on the local console in that situation, but I have not tried with a serial console yet. The build1x systems ran stable with the SLES SP1 kernel before.
And this isn't the CPU frequency scaling hang problem we're tracking elsewhere?
Do you want to keep this open?
Without information to work with I don't see a reason to. But I was actually trying to find out from you...
https://bugzilla.novell.com/show_bug.cgi?id=662317
https://bugzilla.novell.com/show_bug.cgi?id=662317#c43
Jan Beulich