[opensuse-virtual] Intermittent crash on reboot of opensuse 13.2 + Xen 4.5 ?
I can't reproduce this on demand. It doesn't happen every time. But rebooting a running Xen instance does crash occasionally, and when it does, it reports this on the serial console:

...
(XEN) [2015-04-08 16:15:05] irq.c:2120: dom0: forcing unbind of pirq 201
(XEN) [2015-04-08 16:15:05] irq.c:2120: dom0: forcing unbind of pirq 202
(XEN) [2015-04-08 16:15:05] irq.c:2120: dom0: forcing unbind of pirq 203
(XEN) [2015-04-08 16:15:05] irq.c:2120: dom0: forcing unbind of pirq 190
[38366.449320] e1000e: EEE TX LPI TIMER: 00000011
[38366.468732] reboot: Restarting system
(XEN) [2015-04-08 16:15:05] Domain 0 shutdown: rebooting machine.
(XEN) [2015-04-08 16:15:06] ----[ Xen-4.5.0_03-363 x86_64 debug=n Not tainted ]----
(XEN) [2015-04-08 16:15:06] CPU: 0
(XEN) [2015-04-08 16:15:06] RIP: e008:[<000000009e6c4000>] 000000009e6c4000
(XEN) [2015-04-08 16:15:06] RFLAGS: 0000000000010247 CONTEXT: hypervisor
(XEN) [2015-04-08 16:15:06] rax: 000000009e670340 rbx: 0000000000000000 rcx: 0000000000000000
(XEN) [2015-04-08 16:15:06] rdx: 0000000000000000 rsi: 0000000000000000 rdi: 0000000000000000
(XEN) [2015-04-08 16:15:06] rbp: 0000000000000000 rsp: ffff82d080457dd0 r8: 0000000000000000
(XEN) [2015-04-08 16:15:06] r9: 0000000000000000 r10: 0000000000000000 r11: 0000000000000008
(XEN) [2015-04-08 16:15:06] r12: 0000000000000000 r13: 0000000000000061 r14: 00000000fee1dead
(XEN) [2015-04-08 16:15:06] r15: 0000000000000000 cr0: 0000000080050033 cr4: 00000000001526f0
(XEN) [2015-04-08 16:15:06] cr3: 00000008459d5000 cr2: 000000009e6c4000
(XEN) [2015-04-08 16:15:06] ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) [2015-04-08 16:15:06] Xen stack trace from rsp=ffff82d080457dd0:
(XEN) [2015-04-08 16:15:06] 000000009efe42f6 00000000fee1dead ffff82d0802278a4 efff00000000000a
(XEN) [2015-04-08 16:15:06] ffff82d080262000 00000008459d5000 ffff82d080227a7a 000000069ed2a000
(XEN) [2015-04-08 16:15:06] 0000000000000000 0000000000152670 0000000000000700 0000000000000061
(XEN) [2015-04-08 16:15:06] 0000000000000000 00000000fffffffe ffff82d08018477c ffff82d080457e98
(XEN) [2015-04-08 16:15:06] 0000000080457e58 000036363338335b 0000000000000000 0000000000000001
(XEN) [2015-04-08 16:15:06] 0000000000000001 ffff830845989000 ffff830845989138 00000000fee1dead
(XEN) [2015-04-08 16:15:06] ffff82d080129c99 0000000000a0fb00 ffff82d080105721 ffffffffffffffff
(XEN) [2015-04-08 16:15:06] 0000000000000000 0000000028121969 ffffffff80a1f100 0000000000000002
(XEN) [2015-04-08 16:15:06] ffff82d080128d3f 000000010007c000 ffff83009e786000 0000000000000029
(XEN) [2015-04-08 16:15:06] ffffffff80ab47e0 ffff8800bea23b1c ffff83009e786000 0000000028121969
(XEN) [2015-04-08 16:15:06] ffff82d080224119 0000000000000000 0000000000000147 0000000000000008
(XEN) [2015-04-08 16:15:06] 0000000000000000 0000000028121969 0000000000000000 0000000000000282
(XEN) [2015-04-08 16:15:06] 0000000000000064 0000000000000523 0000000000000000 000000000000001d
(XEN) [2015-04-08 16:15:06] ffffffff800113aa 0000000000000001 ffff8800a9f93e1c 0000000000000002
(XEN) [2015-04-08 16:15:06] 0001010000000000 ffffffff800113aa 000000000000e033 0000000000000282
(XEN) [2015-04-08 16:15:06] ffff8800a9f93df8 000000000000e02b 0000000000000000 0000000000000000
(XEN) [2015-04-08 16:15:06] 0000000000000000 0000000000000000 0000000000000000 ffff83009e786000
(XEN) [2015-04-08 16:15:06] 0000000000000000 0000000000000000
(XEN) [2015-04-08 16:15:06] Xen call trace:
(XEN) [2015-04-08 16:15:06] [<000000009e6c4000>] 000000009e6c4000
(XEN) [2015-04-08 16:15:06] [<ffff82d0802278a4>] efi_rs_enter+0xf4/0x110
(XEN) [2015-04-08 16:15:06] [<ffff82d080227a7a>] efi_reset_system+0x3a/0x60
(XEN) [2015-04-08 16:15:06] [<ffff82d08018477c>] machine_restart+0xcc/0x220
(XEN) [2015-04-08 16:15:06] [<ffff82d080129c99>] hwdom_shutdown+0x89/0x90
(XEN) [2015-04-08 16:15:06] [<ffff82d080105721>] domain_shutdown+0xf1/0x100
(XEN) [2015-04-08 16:15:06] [<ffff82d080128d3f>] do_sched_op+0x1af/0x440
(XEN) [2015-04-08 16:15:06] [<ffff82d080224119>] syscall_enter+0xa9/0xae
(XEN) [2015-04-08 16:15:06]
(XEN) [2015-04-08 16:15:06] Pagetable walk from 000000009e6c4000:
(XEN) [2015-04-08 16:15:06] L4[0x000] = 00000008459d4063 ffffffffffffffff
(XEN) [2015-04-08 16:15:07] L3[0x002] = 000000008c674063 ffffffffffffffff
(XEN) [2015-04-08 16:15:07] L2[0x0f3] = 000000009e5ff063 ffffffffffffffff
(XEN) [2015-04-08 16:15:07] L1[0x0c4] = 0000000000000000 ffffffffffffffff
(XEN) [2015-04-08 16:15:07]
(XEN) [2015-04-08 16:15:07] ****************************************
(XEN) [2015-04-08 16:15:07] Panic on CPU 0:
(XEN) [2015-04-08 16:15:07] FATAL PAGE FAULT
(XEN) [2015-04-08 16:15:07] [error_code=0010]
(XEN) [2015-04-08 16:15:07] Faulting linear address: 000000009e6c4000
(XEN) [2015-04-08 16:15:07] ****************************************
(XEN) [2015-04-08 16:15:07]
(XEN) [2015-04-08 16:15:07] Manual reset required ('noreboot' specified)

Is this a known or obvious problem? I'll try to figure out how to reliably reproduce it.

LT

--
To unsubscribe, e-mail: opensuse-virtual+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-virtual+owner@opensuse.org
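For anyone reading the panic above, here is a minimal sketch (an editorial illustration, not something posted in the thread) that decodes two values from the log: the page-fault error code and the page-table indices of the faulting address. It assumes the standard x86-64 4-level paging layout and the architectural page-fault error-code bits. The function names are hypothetical and are not part of Xen.

```python
# Illustrative helpers; the names are hypothetical, not Xen APIs.

# Architectural x86 page-fault error-code bits (bit number -> meaning when set).
PF_BITS = {
    0: "present",            # clear means the page was not present
    1: "write",              # clear means a read access
    2: "user",               # clear means supervisor mode
    3: "reserved-bit",
    4: "instruction-fetch",
}


def decode_pf_error(code):
    """Return the names of the error-code bits that are set."""
    return [name for bit, name in PF_BITS.items() if code & (1 << bit)]


def pagetable_indices(addr):
    """Split a linear address into 4-level page-table indices (9 bits each)."""
    return {
        "L4": (addr >> 39) & 0x1FF,
        "L3": (addr >> 30) & 0x1FF,
        "L2": (addr >> 21) & 0x1FF,
        "L1": (addr >> 12) & 0x1FF,
    }


# Values taken from the log: error_code=0010, address 000000009e6c4000.
print(decode_pf_error(0x0010))
# → ['instruction-fetch']
print({level: hex(i) for level, i in pagetable_indices(0x000000009E6C4000).items()})
# → {'L4': '0x0', 'L3': '0x2', 'L2': '0xf3', 'L1': '0xc4'}
```

The indices match the "Pagetable walk" lines in the log. Only the instruction-fetch bit is set and the present bit is clear, which fits RIP, cr2 and the faulting address all being 000000009e6c4000 with a zero L1 entry: the CPU appears to have tried to execute from an unmapped page, reached via efi_reset_system and efi_rs_enter in the call trace.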
I can't reproduce this on-demand. It doesn't happen every time.
After latest updates to

rpm -qa | grep -i xen | sort
grub2-x86_64-xen-2.02~beta2-20.5.1.x86_64
kernel-xen-3.19.3-2.1.ga30f81d.x86_64
xen-4.5.0_03-363.1.x86_64
xen-libs-4.5.0_03-363.1.x86_64
xen-tools-4.5.0_03-363.1.x86_64

exec'ing `shutdown -r now` from

uname -rm
3.19.3-2.ga30f81d-xen x86_64

the above crash has happened *every* time.

LT
<lyndat3@your-mail.com> 04/08/15 9:19 PM >>> I can't reproduce this on-demand. It doesn't happen every time.
After latest updates to ... the above crash has happened *every* time.
And it looked suspicious that it was intermittent before. Anyway - likely a firmware problem (for which upstream we recently added a workaround), but impossible to tell for sure without a full hypervisor log at at least "info" log level.

Jan
And it looked suspicious that it was intermittent before. Anyway - likely a firmware problem (for which upstream we recently added a workaround), but impossible to tell for sure without a full hypervisor log at at least "info" log level.
What I posted was output with loglvl & guest_loglvl already at "=all". Isn't "all" more verbose already than "info"?

Re the fix, which upstream was that - Xen or kernel? Do you have a commit reference?

LT
I created

Bug 926594 - Xen4.5+kernel-xen-3.19.3-3.1 PANIC on every reboot, "Panic on CPU 0: FATAL PAGE FAULT"
https://bugzilla.suse.com/show_bug.cgi?id=926594

I'll move this discussion there from now on, it's more appropriate.

LT
On 12.04.15 at 17:58, <lyndat3@your-mail.com> wrote: And it looked suspicious that it was intermittent before. Anyway - likely a firmware problem (for which upstream we recently added a workaround), but impossible to tell for sure without a full hypervisor log at at least "info" log level.
What I posted was output with loglvl & guest_loglvl already at "=all". Isn't "all" more verbose already than "info"?
As you had posted only a log fragment, I wasn't able to tell. But yes, "all" is more verbose than "info".
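To make the ordering concrete, here is a small sketch (an editorial illustration, not from the thread) that checks whether a Xen command line sets a log level of at least "info". The level ordering is an assumption based on the Xen command-line documentation (none, error, warning, info, debug, all), as is treating an absent option as less verbose than "info". The function name and the sample command line are hypothetical; the sample simply combines the options mentioned in this thread.

```python
# Assumed ordering of Xen log levels, from least to most verbose.
XEN_LEVELS = ["none", "error", "warning", "info", "debug", "all"]


def loglvl_at_least(cmdline, option="loglvl", minimum="info"):
    """Return True if `option` on a Xen command line is at least `minimum`.

    Hypothetical helper: if the option appears more than once, the last
    occurrence is used; an absent or unrecognised level returns False.
    """
    level = None
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        if key == option and sep:
            # Drop an optional "/<rate-limited level>" suffix.
            level = value.split("/")[0]
    if level not in XEN_LEVELS:
        return False
    return XEN_LEVELS.index(level) >= XEN_LEVELS.index(minimum)


# Sample command line assembled from options mentioned in the thread.
sample = "loglvl=all guest_loglvl=all noreboot"
print(loglvl_at_least(sample))                         # → True
print(loglvl_at_least(sample, option="guest_loglvl"))  # → True
print(loglvl_at_least("loglvl=warning"))               # → False
```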
Re the fix, which upstream was that - Xen or kernel? Do you have a commit reference?
http://xenbits.xen.org/gitweb/?p=xen.git;a=commitdiff;h=c643fb110a51693e82a3...

Jan
participants (3)
- Jan Beulich
- Jan Beulich
- lyndat3@your-mail.com