[Bug 347416] New: OpenSuse 10.3, Xen 3.1.0, unable to install domU over http, entire physical server hangs and crashes
https://bugzilla.novell.com/show_bug.cgi?id=347416

Summary: OpenSuse 10.3, Xen 3.1.0, unable to install domU over http, entire physical server hangs and crashes
Product: openSUSE 10.3
Version: Final
Platform: i686
OS/Version: openSUSE 10.3
Status: NEW
Keywords: dogfood
Severity: Blocker
Priority: P5 - None
Component: Xen
AssignedTo: cgriffin@novell.com
ReportedBy: gb2@c5i.net
QAContact: qa@suse.de
Found By: Customer

When running OpenSuSE 10.3 (kernel 2.6.22.13-xenpae) and Xen 3.1.0 on a Dell PowerEdge 2950, attempts to install a copy of OpenSuSE 10.3 as a domU instance while reading the source via http:// to the local dom0 host result in the entire physical server freezing at random points during the installation of packages.

* This has been duplicated on four separate Dell 2950 servers with different configurations and manufacture dates.

* This occurs in both graphical (i.e. KDE/VNC) mode and text (i.e. xenconsole) mode, and in both runlevels 3 and 5.

* Some had suspected that this might be related to bug 344877, which complains about dom0 hanging under high traffic load, and indeed it did seem to fit the profile; however, a modified kernel set to 100Hz did *not* resolve this problem. Moreover, I cannot duplicate that bug. My installations crash within 5-10 minutes of starting. But I ran "ping -f -q my.local.ip.address" for an hour with no problems:

    core1:/home/glen # ping -f -q my.local.ip.address
    PING nnn.nnn.nnn.nnn (nnn.nnn.nnn.nnn) 56(84) bytes of data.
    --- nnn.nnn.nnn.nnn ping statistics ---
    196356167 packets transmitted, 196356167 received, 0% packet loss, time 3573391ms
    rtt min/avg/max/mdev = 0.010/0.010/30.037/0.003 ms, pipe 2, ipg/ewma 0.018/0.011 ms

  The ping flood ran successfully for hours against both the loopback address and the public-facing IPv4 address.

* I am unable to install from the CD due to bugs in the vm-install python script.

* My web server is just the stock apache2 server with the document root re-set to /opensuse, a directory off root which contains a copy of the installation CD.

* All the troubleshooting steps in /usr/share/doc/packages/xen/README.SuSE have been followed, including booting with different kernel parameters such as pnpacpi=off, lapic, acpi=off, apm=off, noirqbalance, etc.

* I also tried to grab output using sync_console, xm dmesg, etc. No log or serial output is generated, and no evidence of a panic exists. The hardware... just.... stops.

* Obviously, this bug prevents the use of Xen at all on any of my servers. (Plus, any bug which allows a domU instance to reach out and crash the dom0 box would certainly be a show-stopper in any event.)

Please let me know how I can assist with the resolution of this bug, as we cannot move to 10.3 or use virtualization until this is corrected. Thank you!

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
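For anyone comparing flood-ping runs across the affected machines, the summary line quoted in the report can be reduced to a few numbers. The following is only a sketch in modern Python: it assumes the iputils-style summary format shown in the report, and the names `PING_SUMMARY`, `summarize_ping` and `REPORTED` are illustrative, not part of any tool mentioned in this bug.

```python
import re

# Matches the iputils-style summary line as it appears in the report.
PING_SUMMARY = re.compile(
    r"(\d+) packets transmitted, (\d+) received, "
    r"([\d.]+)% packet loss, time (\d+)ms"
)


def summarize_ping(text):
    """Extract packet counts, loss and send rate from ping summary text."""
    match = PING_SUMMARY.search(text)
    if match is None:
        raise ValueError("no ping summary line found")
    sent = int(match.group(1))
    received = int(match.group(2))
    loss_percent = float(match.group(3))
    elapsed_s = int(match.group(4)) / 1000.0  # ping reports milliseconds
    return {
        "sent": sent,
        "received": received,
        "loss_percent": loss_percent,
        "elapsed_s": elapsed_s,
        "packets_per_s": sent / elapsed_s,
    }


# Summary line copied from the report above.
REPORTED = (
    "196356167 packets transmitted, 196356167 received, "
    "0% packet loss, time 3573391ms"
)
```

Calling `summarize_ping(REPORTED)` on the quoted line gives a run of just under an hour with no loss, at a send rate on the order of 55,000 packets per second.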
https://bugzilla.novell.com/show_bug.cgi?id=347416
Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c1
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c2
Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
Jason Douglas
https://bugzilla.novell.com/show_bug.cgi?id=347416
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c3
--- Comment #3 from Andreas Jaeger
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c4
--- Comment #4 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c5
--- Comment #5 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c6
--- Comment #6 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c7
--- Comment #7 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c8
--- Comment #8 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c9
--- Comment #9 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c10
--- Comment #10 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c11
--- Comment #11 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c12
--- Comment #12 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c13
--- Comment #13 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c14
--- Comment #14 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c15
--- Comment #15 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c17
--- Comment #17 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c18
--- Comment #18 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c19
--- Comment #19 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c20
--- Comment #20 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c21
--- Comment #21 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c22
--- Comment #22 from Jan Beulich
Of course, the machine won't reboot, because the kernel on my original 10.3 media getting loaded into domU appears to be incompatible with the dom0 kernel...
ERROR (2,'Invalid kernel','xc_dom_compat_check: guest type xen-3.0-x86_32 not supported by Xen kernel, sorry\n')
I assume this was after you changed from PAE to non-PAE (or the other way around) without re-creating the guest... But, if I understand you correctly, you succeeded in installing a new VM from CD, while in the same system configuration the network install failed. If that's correct, then I agree to
I wonder if my virtual machines, when under high network load during operation, might cause a server halt, thus halting ALL the virtual machines on the server plus the server itself plus my ability to remote restart them. In other words, I wonder if this bug is related to the network, and NOT to just the installation per se.
Rather than continuing the route you outlined, it may thus be much more interesting to see whether you'd get the box to die with (high) network traffic unrelated to VM installation/operation, and whether using a completely different NIC model (specifically driven by another driver) would avoid the problem (in fact I cannot even assure you anyone internally ever tested on a machine using the bnx2 driver - we just have to assume that the hardware drivers work well in Xen if they do so in native, which generally is a valid assumption).

Btw., in light of another bug recently fixed upstream: The VM(s) you install don't take an unusually large (i.e. anything >1 ;-)) number of virtual disks or NICs? (But even if so, this could crash dom0, but shouldn't hang the hypervisor.)
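One way to produce sustained network traffic that is unrelated to VM installation, as suggested above, is to pull a large file from the web server repeatedly over several connections. This is a minimal sketch in modern Python, not a tested reproduction recipe: the function names, worker and round counts, and any URL passed in are assumptions for illustration.

```python
import concurrent.futures
import urllib.request


def fetch_size(url, timeout=30):
    """Download url once and return the number of bytes read."""
    total = 0
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        while True:
            chunk = resp.read(65536)
            if not chunk:
                break
            total += len(chunk)
    return total


def generate_load(url, workers=4, rounds=10, fetch=fetch_size):
    """Fetch url `rounds` times on each of `workers` threads.

    Returns the total number of bytes transferred. The `fetch`
    parameter is injectable so the loop can be exercised without
    a network.
    """
    def worker(_index):
        return sum(fetch(url) for _ in range(rounds))

    with concurrent.futures.ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(worker, range(workers)))
```

Run from a second machine against a large file served by the dom0 web server, for example `generate_load(url, workers=8, rounds=100)` with `url` pointing at that file, this would exercise the NIC and its driver; run on dom0 itself against its own address, the traffic may never reach the physical NIC.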
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c23
--- Comment #23 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c24
Jan Beulich
BUT, when the domU machine went to do the first reboot (preparatory to setting the root password and continuing with YaST install), the boot failed before domU even opened a console window. The popup message appeared to come from the vm-install tool, and was that "guest type xen-3.0-x86_32 not supported by Xen kernel, sorry" message (complete with the escaped newline showing :-).
That message, iirc, means you have a non-PAE guest with a PAE hypervisor/dom0. If you didn't customize the install, then that would seem like a YaST bug.
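The mismatch can be checked before booting a guest by comparing the guest type named in the error with the capabilities the hypervisor advertises. The sketch below assumes that `xm info` prints a `xen_caps` line and that the trailing `p` marks PAE guest types; the sample text is illustrative and was not taken from the reporter's machine.

```python
def parse_xen_caps(xm_info_text):
    """Return the guest types listed on the xen_caps line of `xm info` output."""
    for line in xm_info_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_caps":
            return value.split()
    return []


def guest_type_supported(caps, guest_type):
    """True if the hypervisor advertises exactly this guest type."""
    return guest_type in caps


# Illustrative `xm info` excerpt for a PAE hypervisor (an assumption,
# not output captured from the affected server).
SAMPLE_XM_INFO = """\
xen_major              : 3
xen_minor              : 1
xen_caps               : xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p
"""
```

With this sample, `guest_type_supported(parse_xen_caps(SAMPLE_XM_INFO), "xen-3.0-x86_32")` is false while the `xen-3.0-x86_32p` variant is accepted, which matches the situation the error message describes.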
... My intent is to try to mount the domU filesystem from dom0, and manually copy in the kernels from dom0's boot area into the domU filesystem, and hope that lets the domU machine start and continue the installation procedure?
That ought to work, but ...
I understand that what you want me to do now is try to get this domU machine to boot, and run network load testing from within that machine. Fantastic! I will make that my next move. As soon as I can get the domU machine to finish the install.
Not necessarily. It may even be sufficient to put dom0 alone under high network load. But as I think more about it, the bug mentioned above may well affect you. The patch is queued for backporting to 10.3, I just need to find time to actually get it done.
https://bugzilla.novell.com/show_bug.cgi?id=347416
User gb2@c5i.net added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c25
--- Comment #25 from Glen Barney
https://bugzilla.novell.com/show_bug.cgi?id=347416
User aj@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=347416#c26
Andreas Jaeger