[Bug 350051] New: -xenpae kernels seem to be unusable when running on more than one VCPUs
https://bugzilla.novell.com/show_bug.cgi?id=350051

Summary: -xenpae kernels seem to be unusable when running on more than one VCPUs
Product: openSUSE 10.3
Version: Final
Platform: PC
OS/Version: openSUSE 10.3
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Xen
AssignedTo: cgriffin@novell.com
ReportedBy: nice@titanic.nyme.hu
QAContact: qa@suse.de
Found By: Other

As I mentioned in https://bugzilla.novell.com/show_bug.cgi?id=343181#c19, when I run a Linux system on a -xenpae kernel with more than one VCPU, either as dom0 or as domU, the system is HIGHLY unstable. I tried it with both the i386's /boot/xen-pae.gz and the x86_64's /boot/xen.gz, and the results are the same.

Compiling a kernel on such a system, for example, is equal to suicide. First some segmentation faults occur, then the kernel-space CPU load goes very high, and finally the system becomes totally unresponsive.

I will attach a typical screenshot (combined from several screenshots) of the tty10 of our production mail server (SuSE Linux 8.2(!)) while it was running paravirtualized as a domU under the circumstances described above. It crashed 4-5 times a day when running with more than one VCPU on a -xenpae kernel. It has now run for some days without any problem with one VCPU. Running this 32-bit system on a 64-bit kernel (with either one or more VCPUs) is OK as well. 64-bit kernels seem to be unaffected by this problem.

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c1
--- Comment #1 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c2
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c3
--- Comment #3 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c4
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c5
Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c6
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c7
--- Comment #7 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c8
--- Comment #8 from Tamás Németh
> That bug is about fully virtualized guests, not paravirtualized ones.

I would say instead that the bug is about the behavior of dom0 and paravirtualized guests in the presence of fully virtualized guests. I don't have much time, but I will try to provide some detailed information about the instabilities. What would be useful for you?
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c9
--- Comment #9 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c10
--- Comment #10 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c11
--- Comment #11 from Jan Beulich
> - When the kernel compile suddenly ends, take a look at /dev/tty10, or extract the needed information quickly in some other way, because the system will become totally unresponsive in a few seconds.

Could you get us this information (even if similar to #1, I would still want it from a pure 10.3 setup)?

Also, you seem to imply that that's happening with Dom0, too: In that case, I can't see how /block/xvda1/dev would come into the picture unless you also have DomU-s running.

Also, are you sure you're not running into the time problem we're aware of and have a fix queued (bug 335121)?
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c12
--- Comment #12 from Tamás Németh
> What you're describing sounds more like problems with the originally shipped kernel, which indeed was severely broken.

It happens on the newest 10.3 kernels, too.

> Also, are you sure you're not running into the time problem we're aware of and have a fix queued (bug 335121)?

Oh, that bug seems to be the same as https://bugzilla.novell.com/show_bug.cgi?id=344877, don't you think? Unfortunately I'm familiar with that issue, but I think it differs from this one. Setting the clocksource to jiffies was a good workaround for me, but currently I'm testing a 100 Hz kernel instead. I will investigate whether that bug is identical to this problem.
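For reference, a minimal sketch of how a clocksource switch like the jiffies workaround mentioned above could be inspected and applied at runtime. It assumes the kernel's sysfs clocksource interface under /sys/devices/system/clocksource/clocksource0; the thread does not say which mechanism was actually used (the `clocksource=jiffies` boot parameter is another route), and the function names are hypothetical.

```python
from pathlib import Path

# Assumption: sysfs location of the kernel clocksource interface.
CLOCKSOURCE_DIR = Path("/sys/devices/system/clocksource/clocksource0")


def read_clocksource(base=CLOCKSOURCE_DIR):
    """Return (current, available) clocksources as reported under `base`."""
    current = (Path(base) / "current_clocksource").read_text().strip()
    available = (Path(base) / "available_clocksource").read_text().split()
    return current, available


def select_clocksource(name, base=CLOCKSOURCE_DIR):
    """Switch to `name` if it is listed as available.

    Writing to the real sysfs file requires root privileges.
    """
    _, available = read_clocksource(base)
    if name not in available:
        raise ValueError(f"{name!r} not offered; available: {available}")
    (Path(base) / "current_clocksource").write_text(name)
```

The `base` parameter exists only so the functions can be tried against a scratch directory before touching the real interface.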
> Also, you seem to imply that that's happening with Dom0, too: In that case, I can't see how /block/xvda1/dev would come into the picture unless you also have DomU-s running.

Yes, it definitely happens in dom0 too, just like bug 335121. I only wrote about xvda because the only tty10 logs I could observe were those of a domU system (that infamous SuSE Linux 8.2).

> Could you get us this information (even if similar to #1, I would still want it from a pure 10.3 setup)?

> - logs (Xen, kernel) showing anomalies
> - crash data (register+stack traces)
> - any other technical information you have.

Where can I get those kernel and Xen logs and crash data from? What exactly do you mean? (dmesg, xm dmesg, tty10, /var/log/messages or /var/log/xen/*)
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c13
--- Comment #13 from Jan Beulich
> Oh, that bug seems to be the same as https://bugzilla.novell.com/show_bug.cgi?id=344877 don't you think so?

Yes, as I implied in that bug's comment #8.

> Where can I get those kernel and Xen logs and crash data from? What do you mean exactly? (dmesg, xm dmesg, tty10, /var/log/messages or /var/log/xen/*)

dmesg (or equivalent collected via serial) and xm dmesg (likewise using serial if necessary).
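A minimal sketch of capturing the two logs named above into timestamped files, so they survive until they can be attached to the bug. The helper name `collect_logs`, the output directory, and the file naming are assumptions for illustration; the sketch only helps while the system is still responsive enough to run commands, otherwise the serial console remains the way to go.

```python
import subprocess
from datetime import datetime
from pathlib import Path


def collect_logs(out_dir, commands=None, run=subprocess.run):
    """Save the output of each diagnostic command to its own file in out_dir."""
    if commands is None:
        # Assumed command set: the two logs requested in this comment.
        commands = {"dmesg": ["dmesg"], "xm-dmesg": ["xm", "dmesg"]}
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    saved = {}
    for name, argv in commands.items():
        target = out_dir / f"{name}-{stamp}.log"
        try:
            result = run(argv, capture_output=True, text=True, check=False)
            text = result.stdout + result.stderr
        except OSError as exc:
            # For example, the tool is not installed on this host.
            text = f"could not run {' '.join(argv)}: {exc}\n"
        target.write_text(text)
        saved[name] = target
    return saved
```

Calling `collect_logs("/tmp/bug350051-logs")` as root in dom0 would write one file per command; `xm dmesg` is only meaningful in dom0, so inside a domU only the kernel log is useful.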
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c14
--- Comment #14 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c15
--- Comment #15 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c16
--- Comment #16 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c17
--- Comment #17 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
User nice@titanic.nyme.hu added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c18
--- Comment #18 from Tamás Németh
https://bugzilla.novell.com/show_bug.cgi?id=350051
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c19
--- Comment #19 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jim.pye@pyenet.co.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c20
Jim Pye
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c21
--- Comment #21 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jim.pye@pyenet.co.nz added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c22
--- Comment #22 from Jim Pye
https://bugzilla.novell.com/show_bug.cgi?id=350051
User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c23
--- Comment #23 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=350051
User cthiel@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=350051#c24
Christoph Thiel