[Bug 391709] New: Error when attempting `xm save` of pv guest
https://bugzilla.novell.com/show_bug.cgi?id=391709

           Summary: Error when attempting `xm save` of pv guest
           Product: openSUSE 11.0
           Version: Beta 3
          Platform: Other
        OS/Version: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Xen
        AssignedTo: jfehlig@novell.com
        ReportedBy: jdouglas@novell.com
         QAContact: jdouglas@novell.com
                CC: carnold@novell.com, lbendixs@novell.com
          Found By: ---

Created an attachment (id=216139)
 --> (https://bugzilla.novell.com/attachment.cgi?id=216139)
xend.log

I received the following error after attempting to do an xm save on a 32-bit pv sles10sp2 guest:

xen75:/ # xm save 2 /tmp/sles10sp2-32.sav
Error: /usr/lib64/xen/bin/xc_save 19 2 0 0 0 failed
Usage: xm save [-c] <Domain> <CheckpointFile>

Save a domain state to restore later.
  -c, --checkpoint    Leave domain running after creating snapshot

Running xm list revealed that the VM was in the following state:

Name            ID   Mem VCPUs  State   Time(s)
Domain-0         0 14582     4  r-----    534.2
sles10sp2-32     2  1024     4  ---s--    106.0

I tried doing an xm shutdown and that failed. I then tried an xm restore, and that seemed to succeed, because suddenly I had two sles10sp2-32 VMs listed in xm list:

xen75:/ # xm li
Name            ID   Mem VCPUs  State   Time(s)
Domain-0         0 13559     4  r-----    552.1
sles10sp2-32     2  1024     4  ---s--    106.0
sles10sp2-32     5  1024     4  -b----      0.1

I was able to log in to the restored guest, and things appeared to be working as expected. So the save and restore appear to have worked, except that the VM was not stopped after the successful save.

BTW, I am running the 64-bit hypervisor/dom0.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jdouglas@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c1
--- Comment #1 from Jason Douglas
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jfehlig@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c2
--- Comment #2 from James Fehlig
https://bugzilla.novell.com/show_bug.cgi?id=391709
User agresko@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c3
Aaron Gresko
https://bugzilla.novell.com/show_bug.cgi?id=391709
James Fehlig
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jfehlig@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c9
James Fehlig
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c10
--- Comment #10 from Jan Beulich
I've done some more investigation on this issue and I found that xc_domain_save() in tools/libxc/xc_domain_save.c mysteriously exits the xc_save process when calling munmap() during cleanup at line 1610.
Are you saying that the munmap call doesn't return (but rather results in process exit)? Not being able to read (from the debugger) the memory live_p2m points to doesn't really mean the address is bogus - as long as it's (as you verified) the same one you got from mmap(), I'd assume all is fine with it. Did you try this with a debug build, so you'd get all the DPRINTF() output?
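To illustrate the point that an unreadable address can still be a perfectly valid mapping, here is a minimal standalone sketch. It is only an analogy and does not reproduce the libxc mapping: it assumes Linux, uses an anonymous PROT_NONE page as a stand-in for memory that cannot be read, and the helper names is_readable() and unreadable_mapping_demo() are hypothetical, not part of libxc.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Probe one byte at addr by asking the kernel to copy it into a pipe.
 * Returns 1 if the byte could be read, 0 if the kernel reported EFAULT,
 * and -1 on any other error. (Hypothetical helper for this sketch.) */
static int is_readable(const void *addr)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    ssize_t n = write(fds[1], addr, 1);
    int err = errno;

    close(fds[0]);
    close(fds[1]);

    if (n == 1)
        return 1;
    return (err == EFAULT) ? 0 : -1;
}

/* Map one page with PROT_NONE, probe it, then unmap it.
 * Stores the probe result in *readable and returns munmap()'s result. */
static int unreadable_mapping_demo(int *readable)
{
    assert(readable != NULL);

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    void *p = mmap(NULL, page, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    *readable = is_readable(p);   /* expected to be 0: cannot be read */
    return munmap(p, page);       /* still a valid mapping to unmap   */
}
```

In this sketch the page cannot be read, yet munmap() on the address returned by mmap() is expected to succeed and return normally.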
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jfehlig@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c11
--- Comment #11 from James Fehlig
Are you saying that the munmap call doesn't return (but rather results in process exit)?
Yes.
Did you try this with a debug build, so you'd get all the DPRINTF() output?
As it turns out, I didn't even need a debug build (due to a bug in tools/libxc/xc_private.h) to get the DPRINTF() outputs. Regardless, I did a debug build and added these DPRINTF()'s:

--- xc_domain_save.c.orig 2008-05-28 13:25:24.000000000 -0600
+++ xc_domain_save.c 2008-05-28 13:30:01.000000000 -0600
@@ -1606,8 +1606,11 @@
     if ( live_shinfo )
         munmap(live_shinfo, PAGE_SIZE);
 
-    if ( live_p2m )
+    if ( live_p2m ) {
+        DPRINTF("#### Calling munmap: live_p2m = %p\n", live_p2m);
         munmap(live_p2m, ROUNDUP(p2m_size * sizeof(xen_pfn_t), PAGE_SHIFT));
+        DPRINTF("#### returned from munmap\n");
+    }
 
     if ( live_m2p )
         munmap(live_m2p, M2P_SIZE(max_mfn));

The results (in xend.log):

[2008-05-28 13:22:39 5686] INFO (XendCheckpoint:374) Had 0 unexplained entries in p2m table
[2008-05-28 13:22:45 5686] INFO (XendCheckpoint:374) Saving memory pages: iter 1 95%^M 1: sent 131072, skipped 0, delta 6185ms, dom0 0%, target 0%, sent 694Mb/s, dirtied 0Mb/s 0 pages
[2008-05-28 13:22:45 5686] INFO (XendCheckpoint:374) Total pages sent= 131072 (0.98x)
[2008-05-28 13:22:45 5686] INFO (XendCheckpoint:374) (of which 0 were fixups)
[2008-05-28 13:22:45 5686] INFO (XendCheckpoint:374) All memory is saved
[2008-05-28 13:22:50 5686] INFO (XendCheckpoint:374) #### Calling munmap: live_p2m = 0x7f5ff43ae000
[2008-05-28 13:23:00 5686] ERROR (XendCheckpoint:144) Save failed on domain sles-10-sp2-32-pv-def-net-7a0-1c3 (1).
Traceback (most recent call last):
  File "/usr/lib64/python2.5/site-packages/xen/xend/XendCheckpoint.py", line 112, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.5/site-packages/xen/xend/XendCheckpoint.py", line 362, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 4 1 0 0 0 failed

As you can see, I never get the message "returned from munmap", nor the final debug message (already in the code) just before returning from this function.
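For anyone wanting to try the same before/after tracing outside of xc_save, the following is a minimal standalone sketch of the technique. It is not the libxc code: it assumes Linux, maps ordinary anonymous memory instead of the guest's p2m table, and the names round_up_to_page(), traced_munmap() and demo_roundtrip() are hypothetical. Unlike the patch above, it also checks munmap()'s return value and flushes stderr so that the first message is not lost if the process dies inside the call.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Round len up to a multiple of the system page size.
 * (Hypothetical helper; this is not libxc's ROUNDUP macro.) */
static size_t round_up_to_page(size_t len)
{
    long ps = sysconf(_SC_PAGESIZE);
    assert(ps > 0);
    size_t page = (size_t)ps;
    return (len + page - 1) / page * page;
}

/* Unmap addr, logging to stderr immediately before and after the call.
 * Returns munmap()'s result and preserves its errno. */
static int traced_munmap(void *addr, size_t len)
{
    size_t rounded = round_up_to_page(len);

    fprintf(stderr, "#### Calling munmap: addr = %p, len = %zu\n",
            addr, rounded);
    fflush(stderr);               /* make sure this survives a crash */

    int rc = munmap(addr, rounded);
    int saved_errno = errno;

    if (rc != 0)
        fprintf(stderr, "#### munmap failed: %s\n", strerror(saved_errno));
    else
        fprintf(stderr, "#### returned from munmap\n");
    fflush(stderr);

    errno = saved_errno;
    return rc;
}

/* Map an anonymous region of at least len bytes and unmap it again
 * through the traced wrapper. Returns 0 on success, -1 on failure. */
static int demo_roundtrip(size_t len)
{
    void *p = mmap(NULL, round_up_to_page(len), PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    return traced_munmap(p, len);
}
```

On a healthy mapping both messages should appear and the call should return 0; seeing only the first message, as in the log above, would mean the process never came back from munmap().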
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c12
--- Comment #12 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=391709
User syntron@web.de added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c13
Matthias Pfafferodt
https://bugzilla.novell.com/show_bug.cgi?id=391709
User jfehlig@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=391709#c14
--- Comment #14 from James Fehlig
Is there any new information for this bug?
No. I wasn't able to get to the bottom of this problem before 11.0 went GM. I'll need to see if this behavior still exists in 11.1 code base.
I use opensuse 11.0 and I found a similar bug. I try to save a domain (x02-pluto). The save command exits with an error message. But restore is possible using the file. After that I have the domain in two states: 's' and 'b'.
It's not similar but exactly the behavior I was seeing and debugging. JFYI, you can safely destroy the domain in 's' state after doing the save.