[Bug 579842] New: opensuse VMs hang on shutdown when running on XENserver 5.5
http://bugzilla.novell.com/show_bug.cgi?id=579842#c0

           Summary: opensuse VMs hang on shutdown when running on XENserver 5.5
    Classification: openSUSE
           Product: openSUSE 11.1
           Version: Final
          Platform: x86-64
        OS/Version: openSUSE 11.1
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: bnc-team-screening@forge.provo.novell.com
        ReportedBy: novell-web@zmi.at
         QAContact: qa@suse.de
          Found By: ---
           Blocker: ---

User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; de; rv:1.9.1.7) Gecko/20091222 SUSE/3.5.7-1.1.1 Firefox/3.5.7 ZarafaCheck/1.1.1.20080624.110

We have several XENserver 5.5 machines with completely different hardware (servers and storage). All show the same behaviour: when an openSUSE machine with its current XEN kernel has been running for some time (30 days or so) and you then reboot that VM, it starts to shut down and freezes. You must "force shutdown" from XEN.

It happens on 10.2 and 11.1 (the only two releases we really use; just one machine is on 11.2 already), all running kernel versions 2.6.27.29-0.1-xen to 2.6.27.42-0.1-xen.

I can't tell what the problem is, just the symptom. The system is frozen at that point and I don't know how I could debug it.

Reproducible: Sometimes

Steps to Reproduce:
1. openSUSE VM running on XENserver 5.5 (with Update 1)
2. running for some days (30 or so)

Actual Results:
/sbin/reboot freezes on shutdown

Expected Results:
normal reboot ;-)

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
http://bugzilla.novell.com/show_bug.cgi?id=579842

yang xiaoyu <xyyang@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
                 CC|                                          |xyyang@novell.com
         AssignedTo|bnc-team-screening@forge.provo.novell.com |ps-wt-team@forge.provo.novell.com
https://bugzilla.novell.com/show_bug.cgi?id=579842#c1

Stephen Tse <stephen.tse@novell.com> changed:

           What    |Removed                           |Added
----------------------------------------------------------------------------
                 CC|                                  |stephen.tse@novell.com
         AssignedTo|ps-wt-team@forge.provo.novell.com |bnc-team-screening@forge.provo.novell.com

--- Comment #1 from Stephen Tse <stephen.tse@novell.com> 2010-08-26 18:46:12 UTC ---
Looks like this incorrectly got assigned to us. Sending it back to where it came from.
https://bugzilla.novell.com/show_bug.cgi?id=579842

wei wang <wewang@novell.com> changed:

           What    |Removed                                   |Added
----------------------------------------------------------------------------
                 CC|                                          |wewang@novell.com
         AssignedTo|bnc-team-screening@forge.provo.novell.com |jbeulich@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=579842#c2

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                |Added
----------------------------------------------------------------------------
             Status|NEW                    |NEEDINFO
                 CC|stephen.tse@novell.com |
          Component|Kernel                 |Xen
           Found By|---                    |Community User
       InfoProvider|                       |novell-web@zmi.at

--- Comment #2 from Jan Beulich <jbeulich@novell.com> 2010-09-10 08:09:08 UTC ---
Is this still an issue (with up-to-date kernel in the guest), and if so, is this also an issue with 11.3 (and I think there's a newer version of XenServer available meanwhile too)?

If it indeed is, the xenctx utility (to be used from Dom0) should be used to obtain information on where the guest's vCPU(s) hang(s) (assuming the guest did not suddenly die, which you would know from the xend logs).
https://bugzilla.novell.com/show_bug.cgi?id=579842#c3

Michael Monnerie <novell-web@zmi.at> changed:

           What    |Removed |Added
----------------------------------------------------------------------------
                 CC|        |novell-web@zmi.at

--- Comment #3 from Michael Monnerie <novell-web@zmi.at> 2010-09-20 06:46:35 UTC ---
Hi Jan, I do have a hanging VM now. I found /usr/lib/xen/bin/xenctx on the Xen dom0, but how do I call it?

# /usr/lib/xen/bin/xenctx
usage: xenctx [options] <domid> <optional vcpu>
# /usr/lib/xen/bin/xenctx --help
usage: xenctx [options] <DOMAIN> [VCPU]
options:
  -f, --frame-pointers
                    assume the kernel was compiled with frame pointers.
  -s SYMTAB, --symbol-table=SYMTAB
                    read symbol table from SYMTAB.
  --stack-trace     print a complete stack trace.
  -k, --kernel-start
                    set user/kernel split. (default 0xc0000000)
  -a --all          display more registers

So once it wants "domid" and once DOMAIN. I tried providing the UUID of the VM, but that doesn't work. So how do I find out which parameter I need to give here?
https://bugzilla.novell.com/show_bug.cgi?id=579842#c4

--- Comment #4 from Jan Beulich <jbeulich@novell.com> 2010-09-20 07:00:41 UTC ---
The domain ID is a simple number (varying with each run of the guest), as listed e.g. by "xm list".
https://bugzilla.novell.com/show_bug.cgi?id=579842#c5

--- Comment #5 from Michael Monnerie <novell-web@zmi.at> 2010-09-20 07:42:41 UTC ---
Yes, but you are talking about Xen(source), while I have XenServer 5.5. There isn't a domid, at least I don't see one. I have UUIDs, for example:

# xe vm-list
uuid ( RO)           : 9ba0d669-911f-c6a3-a40a-ae64bdd5867b
     name-label ( RW): zarafa13
    power-state ( RO): running

uuid ( RO)           : b7633bd2-1fd0-a73b-ffba-1d7a4a76a40e
     name-label ( RW): mailsrv13
    power-state ( RO): running

These are the VMs that are currently hanging. I tried /usr/lib/xen/bin/xenctx 1 and that gives output, but I have no idea which VM is domid=1. So I need the mapping between uuid and domid. How do I get that?
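
For the uuid-to-domid question in comment #5, a minimal sketch follows. It assumes that the XenServer `xe` CLI exposes the domain ID as a VM parameter named `dom-id` and prints it in the same "key ( RO): value" layout as the `xe vm-list` output quoted above; neither is confirmed in this thread, so check both on the actual host. The helper name `vm_domids` is hypothetical. It only reformats text read from stdin and never calls `xe` itself.

```shell
#!/bin/sh
# vm_domids (hypothetical helper): read "key ( RO): value" records, as printed
# by `xe vm-list`, from stdin and print one line per VM:
#   <dom-id> <name-label> <uuid>
# Records are assumed to be separated by blank lines.
vm_domids() {
    awk '
        # Print the collected record, if any, and reset the fields.
        function emit() {
            if (uuid != "") print domid, name, uuid
            uuid = name = domid = ""
        }
        /^[ \t]*$/ { emit(); next }            # blank line ends a record
        {
            pos = index($0, ":")               # first colon separates key and value
            if (pos == 0) next
            val = substr($0, pos + 1)
            sub(/^[ \t]+/, "", val)            # trim leading blanks of the value
            if ($1 == "uuid") uuid = val
            else if ($1 == "name-label") name = val
            else if ($1 == "dom-id") domid = val
        }
        END { emit() }
    '
}

# Assumed usage on the XenServer host (not run here):
#   xe vm-list params=uuid,name-label,dom-id | vm_domids
```

The first column of the matching line would then be the number to pass to /usr/lib/xen/bin/xenctx. As noted in comment #4, the domain ID changes with each run of the guest, so it has to be looked up while the VM is still hanging.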
https://bugzilla.novell.com/show_bug.cgi?id=579842#c6

--- Comment #6 from Jan Beulich <jbeulich@novell.com> 2010-09-20 07:53:39 UTC ---
I don't know XenServer's way of management at all, hence I have to rely on you finding out. Alternatively we would have to ask you to reproduce your problem in an environment we support (i.e. an openSUSE or SLE host), or go through the normal support process (which, afaik, is mailing lists only for openSUSE).
https://bugzilla.novell.com/show_bug.cgi?id=579842#c7

--- Comment #7 from Michael Monnerie <novell-web@zmi.at> 2010-09-20 08:17:02 UTC ---
That's a production server and I can't simply change it. It seems we are stuck here, so I will just reboot those VMs. It's a pity we couldn't solve it.

BTW: The two VMs were openSUSE 11.1 with 2.6.27.48-0.2-xen and openSUSE 11.2 with 2.6.31.12-0.2-xen.
https://bugzilla.novell.com/show_bug.cgi?id=579842#c8

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed           |Added
----------------------------------------------------------------------------
             Status|NEEDINFO          |CLOSED
       InfoProvider|novell-web@zmi.at |
         Resolution|                  |WONTFIX

--- Comment #8 from Jan Beulich <jbeulich@novell.com> 2010-11-04 15:32:28 UTC ---
Closing based on this not being reproducible for us.
https://bugzilla.novell.com/show_bug.cgi?id=579842#c9

--- Comment #9 from Michael Monnerie <novell-web@zmi.at> 2010-11-04 17:13:38 UTC ---
I just had this twice again. Both systems are openSUSE 11.1, and I installed today's kernel update. Then on reboot it goes:

INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Boot logging started on /dev/hvc0(/dev/console) at Thu Nov 4 17:49:48 2010
Master Resource Control: previous runlevel: 3, switching to runlevel:6
Shutting down Nagios NRPE done
Shutting down httpd2 (not running) done
Shutting down memcached done
Saving random seed done
Shutting down SSH daemon done
/etc/init.d/kbd stop done
Shutting down Name Service Cache Daemon done

and here the VM is stuck. I have to force-reboot the VM; it works smoothly afterwards. Maybe the upgraded kernel makes the VM hang on shutdown? It seems it doesn't happen on openSUSE 11.2 anymore, but 11.1 is definitely hanging. At least 2 out of 6 VMs had a hang on reboot now.