[Bug 607634] New: hypervisor msgs: "Error getting mfn" and "ptwr_emulate: fixing up invalid PAE PTE"
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c0

Summary: hypervisor msgs: "Error getting mfn" and "ptwr_emulate: fixing up invalid PAE PTE"
Classification: openSUSE
Product: openSUSE 11.2
Version: Final
Platform: x86-64
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Xen
AssignedTo: jdouglas@novell.com
ReportedBy: koenig@linux.de
QAContact: qa@suse.de
Found By: ---
Blocker: ---

running 11.2 on my dom0 I noticed the following hypervisor msgs:

(XEN) mm.c:806:d15 Error getting mfn c02fd (pfn 39cb3) from L1 entry 00000000c02fd063 for l1e_owner=15, pg_owner=15
(XEN) mm.c:4196:d15 ptwr_emulate: fixing up invalid PAE PTE 00000000c02fd063
(XEN) mm.c:806:d15 Error getting mfn 59299 (pfn 1ee13) from L1 entry 0000000059299063 for l1e_owner=15, pg_owner=15
(XEN) printk: 3 messages suppressed.
(XEN) mm.c:806:d15 Error getting mfn c0316 (pfn 39c9a) from L1 entry 00000000c0316063 for l1e_owner=15, pg_owner=15
(XEN) printk: 1 messages suppressed.
(XEN) mm.c:806:d15 Error getting mfn c8037 (pfn 3bf79) from L1 entry 00000000c8037063 for l1e_owner=15, pg_owner=15
(XEN) printk: 11 messages suppressed.
(XEN) mm.c:806:d15 Error getting mfn c0316 (pfn 39c9a) from L1 entry 00000000c0316063 for l1e_owner=15, pg_owner=15
(XEN) printk: 9 messages suppressed.

where domain 15 is a PVM domU running centos4u4.
I stopped/restarted that domain but those msgs remain, now for dom17 (12 active domUs running...):

(XEN) mm.c:806:d17 Error getting mfn 1b1fb (pfn 39987) from L1 entry 000000001b1fb063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 000000001b1fb063
(XEN) mm.c:806:d17 Error getting mfn 1b1fb (pfn 39987) from L1 entry 000000001b1fb063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 000000001b1fb063
(XEN) mm.c:806:d17 Error getting mfn 1b1fb (pfn 39987) from L1 entry 000000001b1fb063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 000000001b1fb063
(XEN) printk: 3486 messages suppressed.
(XEN) mm.c:806:d17 Error getting mfn 1e9f6 (pfn 3618c) from L1 entry 000000001e9f6063 for l1e_owner=17, pg_owner=17
(XEN) printk: 15 messages suppressed.
(XEN) mm.c:806:d17 Error getting mfn 57f7a (pfn 16132) from L1 entry 0000000057f7a063 for l1e_owner=17, pg_owner=17
(XEN) printk: 1 messages suppressed.
(XEN) mm.c:806:d17 Error getting mfn 57f7a (pfn 16132) from L1 entry 0000000057f7a063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 0000000057f7a063
(XEN) mm.c:806:d17 Error getting mfn 57f7a (pfn 16132) from L1 entry 0000000057f7a063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 0000000057f7a063
(XEN) printk: 8 messages suppressed.
(XEN) mm.c:806:d17 Error getting mfn 1e9f6 (pfn 3618c) from L1 entry 000000001e9f6063 for l1e_owner=17, pg_owner=17
(XEN) mm.c:4196:d17 ptwr_emulate: fixing up invalid PAE PTE 000000001e9f6063
(XEN) printk: 14 messages suppressed.

Restarting other domUs does not produce these msgs for them. That centos4u4 guest is up all the time and I've never seen these msgs before.
versions in use right now:

kernel-xen-2.6.31.12-0.2.1.x86_64
kernel-xen-devel-2.6.31.12-0.2.1.x86_64
xen-3.4.1_19718_04-2.1.x86_64
xen-doc-html-3.4.1_19718_04-2.1.x86_64
xen-doc-pdf-3.4.1_19718_04-2.1.x86_64
xen-kmp-default-3.4.1_19718_04_2.6.31.5_0.1-2.1.x86_64
xen-libs-3.4.1_19718_04-2.1.x86_64
xen-libs-32bit-3.3.1_18546_20-0.1.1.x86_64
xen-tools-3.4.1_19718_04-2.1.x86_64

kernel version history from /var/log/zypp/history:

2010-04-12 16:57:02|install|kernel-xen|2.6.27.7-9.1|x86_64||openSUSE 11.1-0|9e97d0a83e4ae29f5d86d32a999ff3ac7a90d451
2010-04-13 15:25:45|install|kernel-xen|2.6.27.45-0.1.1|x86_64||update|8982b9743fd32a3697eab0ba275ce3f512757110
2010-04-22 21:20:23|install|kernel-xen|2.6.31.5-0.1.1|x86_64||oss|8dc4ae6cd00c8c6bd17110d5fe3658d6c404eb87
2010-05-10 18:56:38|install|kernel-xen|2.6.31.12-0.2.1|x86_64|root@os4|updates|d689e14ba483cee34b7376d7a1402aff822c80ae

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
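For readers who want to inspect the "Error getting mfn" lines quoted in the report, the following is a minimal sketch of how one such line can be taken apart. It assumes the standard x86 page-table entry layout (flag bits in the low 12 bits, frame number starting at bit 12). The names `LINE_RE`, `PTE_FLAGS`, and `decode_line` are hypothetical helpers made up for this illustration; they are not part of Xen or of any tool mentioned in the thread, and the regular expression is derived only from the log lines shown above.

```python
import re

# Hypothetical pattern, derived only from the log lines quoted in this report.
LINE_RE = re.compile(
    r"\(XEN\) mm\.c:\d+:d(?P<dom>\d+) Error getting mfn (?P<mfn>[0-9a-f]+) "
    r"\(pfn (?P<pfn>[0-9a-f]+)\) from L1 entry (?P<l1e>[0-9a-f]+)"
)

# Low flag bits of an x86 page-table entry (standard architectural layout).
PTE_FLAGS = {
    0x001: "PRESENT",
    0x002: "RW",
    0x004: "USER",
    0x008: "PWT",
    0x010: "PCD",
    0x020: "ACCESSED",
    0x040: "DIRTY",
}


def decode_line(line):
    """Return the fields of one 'Error getting mfn' line, or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    l1e = int(m.group("l1e"), 16)
    # Assumption: bits 12..51 of a PAE/64-bit entry hold the frame number.
    frame = (l1e >> 12) & ((1 << 40) - 1)
    flags = [name for bit, name in PTE_FLAGS.items() if l1e & bit]
    return {
        "domain": int(m.group("dom")),
        "mfn": int(m.group("mfn"), 16),
        "pfn": int(m.group("pfn"), 16),
        "frame_from_l1e": frame,
        "flags": flags,
    }


sample = (
    "(XEN) mm.c:806:d15 Error getting mfn c02fd (pfn 39cb3) from L1 entry "
    "00000000c02fd063 for l1e_owner=15, pg_owner=15"
)
info = decode_line(sample)
print(info)
```

Under these assumptions, the frame number extracted from the L1 entry equals the mfn printed in the same line, and the low bits `0x063` decode to present, writable, accessed, and dirty. The sketch only decodes what the log already contains; it says nothing about why the hypervisor rejected the entry.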
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c
Jason Douglas
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c1
Jan Beulich
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c2
Harald Koenig
We can't fix non-SuSE guest kernels (and the issue clearly is with wrongly ordered page table updates in the guest), so it's not clear to me what you expect us to do. Please clarify.
I'm not enough of a Xen expert to judge whether this is a client-only issue or a generic hypervisor problem. Just as with regular Unix kernels, where "a user space process shall never be able to crash the kernel", I'd expect the hypervisor not to crash due to domU kernel issues, no matter whether a SUSE or Red Hat kernel is running in the domU. My expectations may be wrong in this case?!

First, I wanted to inform you of this new behaviour, just FYI: maybe it's old and known, or it might show up again in the future...

And since this does not show up with an 11.1 dom0 (I'm not sure about earlier 11.2/11.3 tests, because I only detected this via the Xen messages on the serial console, which I have been monitoring for only the last few days), might it be related to the dom0 Xen version too (just a thought and a question)?
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c3
Jan Beulich
> I'm not enough of a Xen expert to judge whether this is a client-only issue or a generic hypervisor problem. Just as with regular Unix kernels, where "a user space process shall never be able to crash the kernel", I'd expect the hypervisor not to crash due to domU kernel issues, no matter whether a SUSE or Red Hat kernel is running in the domU. My expectations may be wrong in this case?!

So far there was no mention of the hypervisor crashing. If it indeed does, please provide the log thereof (and possibly adjust the description of the bug accordingly).

> First, I wanted to inform you of this new behaviour, just FYI: maybe it's old and known, or it might show up again in the future...

Yes, these messages are known (with a known guest kernel side fix).

> And since this does not show up with an 11.1 dom0 (I'm not sure about earlier 11.2/11.3 tests, because I only detected this via the Xen messages on the serial console, which I have been monitoring for only the last few days), might it be related to the dom0 Xen version too (just a thought and a question)?

The presence of these messages certainly may depend on the hypervisor version you use (although the conditions checked in the 11.1 and 11.2 hypervisors look very similar), or on your loglevel settings. Further, if the guest doesn't die as a consequence of this, I take this as an indication that the condition is detected correctly by Xen, and that it is applying the right workaround.
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c
Ihno Krumreich
http://bugzilla.novell.com/show_bug.cgi?id=607634
http://bugzilla.novell.com/show_bug.cgi?id=607634#c4
Jan Beulich