[Bug 851244] New: kernel BUG at /home/abuild/rpmbuild/BUILD/kernel-xen-3.11.6/linux-3.11/arch/x86/mm/hypervisor.c:652!
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c0

Summary: kernel BUG at /home/abuild/rpmbuild/BUILD/kernel-xen-3.11.6/linux-3.11/arch/x86/mm/hypervisor.c:652!
Classification: openSUSE
Product: openSUSE 13.1
Version: Final
Platform: x86-64
OS/Version: openSUSE 13.1
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: dieter@bloms.de
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/31.0.1650.57 Safari/537.36

Hi, I used openSUSE 12.3 in the past as a domU and it was stable. After I switched to openSUSE 13.1, my domU stopped working after about one day. I saw this message on the console (xl console domU). This has happened twice so far, so I think there must be a bug.

[86408.298716] ------------[ cut here ]------------
[86408.298742] kernel BUG at /home/abuild/rpmbuild/BUILD/kernel-xen-3.11.6/linux-3.11/arch/x86/mm/hypervisor.c:652!
[86408.298749] invalid opcode: 0000 [#1] SMP
[86408.298757] Modules linked in: nfsd lockd nfs_acl auth_rpcgss sunrpc oid_registry sg dm_mod autofs4 scsi_dh_emc scsi_dh_rdac scsi_dh_hp_sw scsi_dh_alua scsi_dh xenblk cdrom xennet
[86408.298793] CPU: 0 PID: 6822 Comm: mandb Not tainted 3.11.6-4-xen #1
[86408.298798] Hardware name: Xen 4.3.1 PV guest
[86408.298803] task: ffff8800186de700 ti: ffff880001920000 task.ti: ffff880001920000
[86408.298808] RIP: e030:[<ffffffff800293e9>] [<ffffffff800293e9>] xen_pgd_pin+0x1a9/0x1d0
[86408.298823] RSP: e02b:ffff880001921dc0 EFLAGS: 00010282
[86408.298827] RAX: ffffffffffffffea RBX: ffff88000193b000 RCX: 000000000000c9d8
[86408.298832] RDX: 0000000000000000 RSI: 0000000000000002 RDI: ffff880001921dc0
[86408.298836] RBP: ffff88002da0b400 R08: 0000000000149fb6 R09: ffff88008193b000
[86408.298840] R10: 0000000000007ff0 R11: 0000000000000001 R12: 0000000000000000
[86408.298844] R13: ffff8800011a0e48 R14: ffff88002d83e608 R15: ffff88002d83e618
[86408.298857] FS: 00007f08279a7700(0000) GS:ffff88002ec00000(0000) knlGS:0000000000000000
[86408.298862] CS: e033 DS: 0000 ES: 0000 CR0: 000000008005003b
[86408.298866] CR2: 00007f0826438b00 CR3: 0000000001983000 CR4: 0000000000000660
[86408.298871] Stack:
[86408.298874] ffff880000000003 0000000000149fb6 0000000000000000 ffff880000000003
[86408.298883] 0000000000149fbd ffff88002d83e618 ffffffff8002522b ffff88002da0b400
[86408.298892] ffffffff80025777 ffff8800015187c0 ffffffff8002db99 ffff88002d83e630
[86408.298901] Call Trace:
[86408.298933] [<ffffffff8002522b>] __pgd_pin+0x1b/0x60
[86408.298943] [<ffffffff80025777>] mm_pin+0x27/0x40
[86408.298951] [<ffffffff8002db99>] dup_mm+0x379/0x5e0
[86408.298961] [<ffffffff8002f08b>] copy_process.part.38+0x125b/0x13a0
[86408.298973] [<ffffffff8002f344>] do_fork+0xa4/0x340
[86408.298984] [<ffffffff8052559f>] stub_clone+0x3f/0x50
[86408.298997] [<00007f0826408675>] 0x7f0826408674
[86408.299001] Code: 39 d0 73 22 48 8b 15 5f 61 9e 00 48 b9 ff ff ff ff ff ff ff 7f 48 23 0c c2 48 89 c8 48 89 44 24 20 e9 7c ff ff ff e8 aa a6 4e 00 <0f> 0b b9 f0 7f 00 00 31 d2 be 02 00 00 00 48 89 e7 e8 01 f2 ff
[86408.299087] RIP [<ffffffff800293e9>] xen_pgd_pin+0x1a9/0x1d0
[86408.299094] RSP <ffff880001921dc0>
[86408.299119] ---[ end trace bb521754034fb69d ]---

My Dom0 is Alpine Linux with Xen 4.3.1 installed, which was fine with the openSUSE 12.3 domUs. My motherboard is this one (http://www.asus.com/Motherboards/E45M1M_PRO/#overview), based on an AMD E450 CPU. If you need more information, please ask.

Reproducible: Always

Steps to Reproduce: Install openSUSE 13.1 in a domU and let it run for one or two days. For me it seems to happen when there is a little traffic (in my case, copying files with rsync over the network).

Actual Results: System freeze
Expected Results: no system freeze ;)

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
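A note for readers of the register dump above: RAX holds ffffffffffffffea, which reads as a small negative number when interpreted as a signed 64-bit value. The sketch below decodes it. Two points are assumptions rather than anything confirmed in this thread: that RAX still contains the return value of the call that failed just before the BUG, and that the value follows Linux errno numbering. The helper name decode_return_register is made up for illustration.

```python
import errno


def decode_return_register(hex_value: str) -> int:
    """Interpret a 64-bit register dump (hex, no 0x prefix) as a signed integer."""
    raw = int(hex_value, 16)
    # Values with the top bit set are negative in two's complement.
    if raw >= 1 << 63:
        raw -= 1 << 64
    return raw


# RAX as printed in the oops above.
rax = decode_return_register("ffffffffffffffea")

# ASSUMPTION: the value is a negated errno-style code; look up its symbolic name.
name = errno.errorcode.get(-rax, "unknown")

print(f"RAX = {rax} ({name})")
```

If those assumptions hold, the value corresponds to an "invalid argument" style failure, which would fit a pin request being rejected, but the thread itself does not state this.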
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Tomas Chvatal
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Charles Arnold
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c1
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c2
Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c3
--- Comment #3 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c4
--- Comment #4 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c5
--- Comment #5 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c6
--- Comment #6 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c7
Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c8
--- Comment #8 from Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c9
Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c10
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c11
Martin Pluskal
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c12
--- Comment #12 from Martin Pluskal
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c13
--- Comment #13 from Martin Pluskal
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c14
--- Comment #14 from Martin Pluskal
From the above it looks relatively random (many apparently successful VM runs);
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c15
--- Comment #15 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c16
--- Comment #16 from Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c17
--- Comment #17 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c18
--- Comment #18 from Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c19
--- Comment #19 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c20
--- Comment #20 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c21
--- Comment #21 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c22
--- Comment #22 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c23
--- Comment #23 from Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c24
--- Comment #24 from Sebastian Buntin
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c25
--- Comment #25 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c26
--- Comment #26 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c27
--- Comment #27 from Nathan Hallquist
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c28
--- Comment #28 from Martin Pluskal
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c29
--- Comment #29 from Jan Beulich
Can all of you please stop adding redundant information (like all the same logs over and over again) here? And _new_ information is certainly appreciated. I have encountered issue during installation (every second installation in
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c30
--- Comment #30 from Martin Pluskal
Also, can everyone adding text attachments please make sure they're marked as such, so someone wanting to look at them doesn't have to first change their MIME types?
Thank you.
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c31
--- Comment #31 from Jan Beulich
I have encountered the issue during installation: every second installation of openSUSE-13.2-M0 in paravirtualized Xen triggers it (the host is either SLE-12 or openSUSE-13.2-M0).
You seem to be the only one to trigger it this frequently and reliably. That's even more so considering that SLE12's kernel is building on the same set of patches without having a known fix for such an issue, and we'd surely have seen reports there by now if the issue was that prominent.
I have provided you with the dmesg from the guest and, since I am using libvirtd-xen, the corresponding log.
The host kernel log does not contain anything even remotely out of the ordinary.
Would you please be so kind as to tell me what you need me to provide?
I didn't say there is anything specific, I merely asked to stop adding redundant information. The one crucial bit of new information would be if someone went and debugged this, and found the cause. Remember - openSUSE is a community project, so I shouldn't be the only one trying to find a solution to this.
Also, this issue is not very difficult for you to reproduce yourself: just start an installation of openSUSE-13.2-M0 in paravirtualized Xen, assign two or more CPUs, and you will most likely run into it.
And I'll obviously try to (but don't expect to be able to see it that easily) once I find time to look into this.
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c32
--- Comment #32 from Jan Beulich
The one crucial bit of new information would be if someone went and debugged this, and found the cause. Remember - openSUSE is a community project, so I shouldn't be the only one trying to find a solution to this.
Just to aid with this (i.e. to tell where the page is still being mapped writably, albeit I think I can guess where it is, I just don't know why it is there, but confirmation would certainly be desirable), if anyone cares (patch should apply without significant changes to earlier hypervisor versions).
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c33
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c34
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c35
Dion Kant
You seem to be the only one to trigger it this frequently and reliably. That's even more so considering that SLE12's kernel is building on the same set of patches without having a known fix for such an issue, and we'd surely have seen reports there by now if the issue was that prominent.
I think too many people out there are thinking along the lines of "Am I stupid that I encounter this bug, search Bugzilla, find no one else reporting it, and do not submit a report about it?" I belong to this class of people myself. To work around this bug I started installing VMs based on Debian instead of openSUSE-based domains. That solved my problem, but it does not help the quality of openSUSE releases. Of course, the community is to blame for not filing reports. However, I wonder why this bug, which is so reproducible on some systems, is so hard to reproduce on the systems you are using for testing. It may be wise to look a bit further into this. Anyway, I have decided to start filing reports in Bugzilla for the bugs I encounter more frequently, and to start testing early in the release process.
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=851244
https://bugzilla.novell.com/show_bug.cgi?id=851244#c36
--- Comment #36 from Swamp Workflow Management
http://bugzilla.novell.com/show_bug.cgi?id=851244
Swamp Workflow Management
participants (1)
-
bugzilla_noreply@novell.com