[Bug 675363] New: Random lockups with kernel-xen. Possibly graphics related
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c0

Summary: Random lockups with kernel-xen. Possibly graphics related
Classification: openSUSE
Product: openSUSE 11.4
Version: RC 1
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Xen
AssignedTo: jdouglas@novell.com
ReportedBy: jfunk@funktronics.ca
QAContact: qa@suse.de
Found By: ---
Blocker: ---

Created an attachment (id=416401) --> (http://bugzilla.novell.com/attachment.cgi?id=416401) Netconsole log containing Oops

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:2.0b11) Gecko/20110203 Firefox/4.0b11

I have been experiencing some random lockups when running Xen with Xorg in Dom0. I've been having this problem with 11.3 as well.

Interestingly, as long as I'm not running X in Dom0, it's as stable as can be, but it crashes fairly often when I'm running X. More often than not, I'm scrolling a PDF when it locks up or reboots. When it locks up, sysctl keys do not work.

I managed to capture the oops via netconsole. Sadly, syslog-ng has mangled it somewhat.

Reproducible: Sometimes

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c
Charles Arnold
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c1
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c2
--- Comment #2 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c3
--- Comment #3 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c4
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c5
Jan Beulich
Created an attachment (id=416665) --> (http://bugzilla.novell.com/attachment.cgi?id=416665) [details] Xen log
This is the xend log, not the hypervisor one. If the system comes up, "xm dmesg" will give you what is needed. If it doesn't, your only option is using a serial console. You'll want to add "loglvl=all guest_loglvl=all" to the Xen command line in any case. (In reply to comment #2)
Created an attachment (id=416664) --> (http://bugzilla.novell.com/attachment.cgi?id=416664) [details] Boot message log
Please also provide a native kernel's boot messages for comparison.
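The request to add "loglvl=all guest_loglvl=all" to the Xen command line can be sketched as a small helper that appends only the options that are still missing. This is a minimal illustration, not a tool from this bug: the function name add_xen_options and the example boot line are hypothetical, and the caller is assumed to pass in the hypervisor line from the boot loader configuration.

```python
def add_xen_options(cmdline, wanted=("loglvl=all", "guest_loglvl=all")):
    """Return cmdline with any missing options from `wanted` appended.

    An option counts as present if its key (the part before '=') already
    appears, so an existing value such as loglvl=warning is left untouched
    and has to be changed by hand.
    """
    tokens = cmdline.split()
    present = {token.split("=", 1)[0] for token in tokens}
    for option in wanted:
        key = option.split("=", 1)[0]
        if key not in present:
            tokens.append(option)
            present.add(key)
    return " ".join(tokens)


if __name__ == "__main__":
    # Hypothetical hypervisor line; the real path and options will differ.
    print(add_xen_options("kernel /boot/xen.gz mem=4G"))
```

Running the helper twice on the same line leaves it unchanged, so it is safe to apply repeatedly while collecting logs.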
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c6
--- Comment #6 from Jan Beulich
Created an attachment (id=416666) --> (http://bugzilla.novell.com/attachment.cgi?id=416666) [details] Netconsole log with more history
I updated to RC2, and set mem=4G.
Confusing: The oops message contains various pointers (e.g. the CR3 and RSP values) that indicate that Dom0 has more than 4G of memory, which implies that Xen must have more than 4G of memory too. The boot log in #2, however, shows that memory was restricted. Were the two taken from different boots (and did you perhaps leave off the mem=4G from the second)? Otherwise, only a single complete log will help understanding what else is going on here.
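The reasoning about the CR3 and RSP values can be checked mechanically. The sketch below is one plausible reading of it, not the method used in this bug: it assumes an x86_64 kernel whose direct mapping starts at 0xffff880000000000 (an assumption about kernels of this era), and the register values in the example are made up, not taken from the attached oops.

```python
FOUR_GIB = 1 << 32

# Assumption: start of the x86_64 kernel direct mapping for this kernel.
PAGE_OFFSET = 0xFFFF880000000000


def cr3_above_4g(cr3):
    """True if the page-table base held in CR3 lies at or above 4 GiB."""
    # The low 12 bits of CR3 are flag bits, not part of the address.
    return (cr3 & ~0xFFF) >= FOUR_GIB


def direct_map_offset(vaddr):
    """Offset of a direct-mapped kernel virtual address from PAGE_OFFSET."""
    if vaddr < PAGE_OFFSET:
        raise ValueError("address is below the assumed direct-map base")
    return vaddr - PAGE_OFFSET


if __name__ == "__main__":
    # Hypothetical register values, for illustration only.
    cr3 = 0x0000000123456000
    rsp = 0xFFFF880123456000
    print("CR3 above 4G:", cr3_above_4g(cr3))
    print("RSP offset above 4G:", direct_map_offset(rsp) >= FOUR_GIB)
```

If either check is true for values from a boot that was supposedly limited with mem=4G, the oops and the boot log most likely come from different boots.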
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c7
--- Comment #7 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c8
--- Comment #8 from James Oakley
(In reply to comment #4)
Created an attachment (id=416666) --> (http://bugzilla.novell.com/attachment.cgi?id=416666) [details] Netconsole log with more history
I updated to RC2, and set mem=4G.
Confusing: The oops message contains various pointers (e.g. the CR3 and RSP values) that indicate that Dom0 has more than 4G of memory, which implies that Xen must have more than 4G of memory too. The boot log in #2, however, shows that memory was restricted. Were the two taken from different boots (and did you perhaps leave off the mem=4G from the second)? Otherwise, only a single complete log will help understanding what else is going on here.
I restricted the log to that day, so there are multiple crashes in the netconsole log. Also, an Oops is not generated every time. Sometimes, it just suddenly restarts. I'm building a kernel with the patch you posted. Hopefully that will work out.
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c9
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c10
--- Comment #10 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c11
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c12
--- Comment #12 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c13
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c14
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c15
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c16
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c17
--- Comment #17 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c18
--- Comment #18 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c19
--- Comment #19 from Charles Arnold
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c20
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c21
Charles Arnold
He'll also need yesterday's hypervisor.
I've also uploaded the hypervisor and xen-libs and xen-tools to the same location specified in comment #19.
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c22
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c23
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c24
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c25
--- Comment #25 from Jan Beulich
In any case, those messages tell us there's something fishy going on.
And your graphics worked fine nevertheless?
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c26
--- Comment #26 from Jan Beulich
If you could filter out the "PFN ... used as IOMEM" messages (apart from perhaps the first few ones and, as you already did, the last few ones; similarly for any "RAM ... used as IOMEM" ones), ...
From the fragment you supplied it would seem that these messages may repeat for a small set of numbers over and over - can you check that's really the case? That might provide further insight as to where these mappings are coming from (and might be a reason why, for quite a while, things seem to be working for you).
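Both requests above (dropping all but the first and last few "used as IOMEM" messages, and checking whether they repeat for a small set of numbers) can be sketched in a few lines. This is a rough illustration only: the exact message format is not shown in this thread, so the pattern below is an assumption that treats whatever sits between "PFN" or "RAM" and "used as IOMEM" as an opaque key, and the name filter_iomem is hypothetical.

```python
import re
from collections import Counter

# Assumed shape: "<PFN|RAM> <something> used as IOMEM". The middle part is
# kept as an opaque key because the real format is not given in the thread.
IOMEM_RE = re.compile(r"\b(PFN|RAM)\s+(.+?)\s+used as IOMEM")


def filter_iomem(lines, keep_first=3, keep_last=3):
    """Drop all but the first and last few IOMEM messages and count keys.

    Returns (kept_lines, counts), where counts maps (kind, key) to the
    number of times that message occurred in the unfiltered log.
    """
    lines = list(lines)
    hits = [i for i, line in enumerate(lines) if IOMEM_RE.search(line)]
    keep = set(hits[:keep_first])
    keep.update(hits[max(0, len(hits) - keep_last):])
    hit_set = set(hits)
    kept = [line for i, line in enumerate(lines)
            if i not in hit_set or i in keep]
    counts = Counter(IOMEM_RE.search(lines[i]).group(1, 2) for i in hits)
    return kept, counts


if __name__ == "__main__":
    import sys
    kept, counts = filter_iomem(sys.stdin.read().splitlines())
    print("\n".join(kept))
    # A short list here means the same few numbers repeat over and over.
    for (kind, key), n in counts.most_common(20):
        print(f"# {kind} {key}: {n} times", file=sys.stderr)
```

The filtered log goes to standard output and the repeat counts go to standard error, so the log can be redirected to a file for attaching while the counts stay visible on the terminal.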
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c27
--- Comment #27 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c28
--- Comment #28 from Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c29
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c30
Jan Beulich
I really hope this provides enough information. This is a work system and today is my last day so I won't be able to do much testing. :-)
Does this mean we won't be able to make any progress here anymore? If so, could you at least still provide full hardware details (namely /proc/cpuinfo contents and "lspci -nn" output)?
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c31
James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c32
--- Comment #32 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c33
--- Comment #33 from James Oakley
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c34
Jan Beulich
Created an attachment (id=424794) --> (http://bugzilla.novell.com/attachment.cgi?id=424794) [details] Full serial log, with "used as IOMEM?" messages filtered.
Unfortunately the log is incomplete (and thus at best of limited use) - the output rate is too high, so the serial console dropped characters. This would need to be redone with "sync_console" added to the Xen command line. Also, as indicated earlier, you will need to get under control the cooling problem of that system (which at once would further limit the amount of output). But let's first see whether we can locate a machine similar to yours (particularly wrt graphics) in our lab, and see whether this is reproducible. Preston, can you please do that?
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c36
Preston Millett
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c
Jan Beulich
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c37
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c38
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c39
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c40
Swamp Workflow Management
https://bugzilla.novell.com/show_bug.cgi?id=675363
https://bugzilla.novell.com/show_bug.cgi?id=675363#c41
Jan Beulich