[Bug 394566] New: vmlinuz-2.6.25.4-2-xen crashes in early stage of shutdown
https://bugzilla.novell.com/show_bug.cgi?id=394566

Summary: vmlinuz-2.6.25.4-2-xen crashes in early stage of shutdown
Product: openSUSE 11.0
Version: Beta 3
Platform: i686
OS/Version: openSUSE 11.0
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Xen
AssignedTo: cgriffin@novell.com
ReportedBy: gu.schwarz@web.de
QAContact: qa@suse.de
Found By: Beta-Customer

Created an attachment (id=218194) --> (https://bugzilla.novell.com/attachment.cgi?id=218194)
output kernel bug in /var/log/messages

Kernel vmlinuz-2.6.25.4-2-xen crashes on shutdown or even when trying to change from runlevel 5 to 3. Output in /var/log/messages is attached.

Hardware:
FSC Esprimo P5925
Intel iQ35
Intel Core 2 Duo E8200 2,66 GHz
BIOS Version V6.00 R1.11.2584_A1

Software:
openSUSE 11.0 beta 3 64bit

Tests done so far:
- booting with acpi=off: no success
- shutting down the xen network bridge prior to init 3: no success

This is my first entry in this system. Please excuse me if the bug description is missing something. I'm willing to run additional tests, but as a non-expert in kernel-related problems I will need descriptive instructions to do so.

Günther

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=394566

Charles Arnold <carnold@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |lbendixs@novell.com
         AssignedTo|cgriffin@novell.com         |jbeulich@novell.com
          QAContact|qa@suse.de                  |jdouglas@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=394566

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c1

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |gu.schwarz@web.de

--- Comment #1 from Jan Beulich <jbeulich@novell.com> 2008-05-29 04:19:15 MDT ---
First of all I take it for granted that the same issue doesn't exist when you run the native kernel.

We'll need Xen's boot messages (you can obtain them through 'xm dmesg' right after boot completed).

We'd like you to try leaving DRM (and if that doesn't help, AGP) out of the picture. For the former, it should be sufficient to re-configure X to not use 3D acceleration. You can verify this after X has come up by checking that the modules i915 and drm aren't present. If that doesn't suffice, you'd need to either blacklist or move away/rename the respective modules under /lib/modules/2.6.25.4-2-xen/. For AGP, you'd need to use the latter method in any case (the modules in question here are intel-agp and agpgart). If forcibly leaving out modules, it is possible that X cannot start up at all anymore, in which case some more re-configuring of X would be needed.

Another thing we'd like you to try is to enable debugging in the drm kernel module by echoing 1 into /sys/module/drm/parameters/debug. Once that is done and you have shut down and restarted, we'd need the full portion of /var/log/messages pertaining to the previous session (not just the oops portion of it) plus the boot messages (which at this point would be in /var/log/boot.omsg).
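The check asked for in comment #1 (are i915, drm, intel-agp and agpgart still loaded?) can also be scripted. The following is only a minimal sketch, assuming the module list is read from /proc/modules, the file that lsmod formats. The helper names `loaded_modules` and `still_loaded` are made up for this illustration and are not part of any tool mentioned in the thread.

```python
from pathlib import Path

# Module names as given in comment #1.
SUSPECTS = ("i915", "drm", "intel-agp", "agpgart")


def loaded_modules(proc_modules_text):
    """Return the set of module names found in /proc/modules-style text."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the module name
    return names


def still_loaded(proc_modules_text, suspects=SUSPECTS):
    """Return the suspects that are currently loaded, in the order given."""
    loaded = loaded_modules(proc_modules_text)
    # The kernel lists module names with underscores, so normalise hyphens.
    return [m for m in suspects if m.replace("-", "_") in loaded]


if __name__ == "__main__":
    path = Path("/proc/modules")
    if path.exists():
        print("still loaded:", still_loaded(path.read_text()))
```

An empty list after X has come up would indicate that none of the four modules is present.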
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c2 --- Comment #2 from Günther Schwarz <gu.schwarz@web.de> 2008-05-30 14:35:13 MDT --- (In reply to comment #1 from Jan Beulich)
> First of all I take it for granted that the same issue doesn't exist when you run the native kernel.
Yes indeed. Unlike previous Suse kernels the default one runs nicely on this PC. That is why I became interested in the beta release.
> We'll need Xen's boot messages (you can obtain them through 'xm dmesg' right after boot completed).
I've attached these files before and after disabling the four modules i915, drm, intel-agp and agpgart as well as output from /var/log/boot.omsg and the relevant parts of /var/log/messages. Actually I can't see how to attach a file when replying in this web-based system. So I will write another entry with the attachments. The files are purged of IP addresses and other information I do not want to publish. You will notice from the file that there were problems with the network. I did not notice these on my previous tries.

Without these modules I'm indeed able to shut down the system. I can run further tests upon request. But since I submitted my report I noticed that the new SLED10 SP2 runs very nicely on this system. Much better than the openSUSE versions I tried so far (10.2 almost impossible, 10.3 with issues not only with Xen, and this beta of 11.0). So I will most likely stay with SLED.

Günther
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c3 --- Comment #3 from Günther Schwarz <gu.schwarz@web.de> 2008-05-30 14:37:57 MDT --- Created an attachment (id=219265) --> (https://bugzilla.novell.com/attachment.cgi?id=219265) output of xm dmesg with all modules loaded
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c4 --- Comment #4 from Günther Schwarz <gu.schwarz@web.de> 2008-05-30 14:38:45 MDT --- Created an attachment (id=219266) --> (https://bugzilla.novell.com/attachment.cgi?id=219266) output of xm dmesg with modules blacklisted
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c5 --- Comment #5 from Günther Schwarz <gu.schwarz@web.de> 2008-05-30 14:39:29 MDT --- Created an attachment (id=219267) --> (https://bugzilla.novell.com/attachment.cgi?id=219267) /var/log/boot.omsg
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c6 --- Comment #6 from Günther Schwarz <gu.schwarz@web.de> 2008-05-30 14:40:30 MDT --- Created an attachment (id=219268) --> (https://bugzilla.novell.com/attachment.cgi?id=219268) part of /var/log/messages
https://bugzilla.novell.com/show_bug.cgi?id=394566 User jbeulich@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c7 --- Comment #7 from Jan Beulich <jbeulich@novell.com> 2008-06-02 01:21:23 MDT --- Okay, but that's only part of what we'd need: With all four modules suppressed, did you reconfigure X so it would still come up (I'm pretty certain it doesn't come up without reconfiguration)? Did you try suppressing just drm and i915? The boot.omsg and messages fragment we need to see should be from a boot with all modules enabled, and with DRM debugging enabled prior to initiating shutdown (as described). I take your comment about SLED to say that there you don't have the problem reported here? As cross reference knowledge, would you check whether the drm/i915 pair is in use there?
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c8 --- Comment #8 from Günther Schwarz <gu.schwarz@web.de> 2008-06-02 12:32:08 MDT --- Created an attachment (id=219605) --> (https://bugzilla.novell.com/attachment.cgi?id=219605) out of /var/log/messages with modules loaded
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c9 --- Comment #9 from Günther Schwarz <gu.schwarz@web.de> 2008-06-02 12:32:57 MDT --- Created an attachment (id=219607) --> (https://bugzilla.novell.com/attachment.cgi?id=219607) output of /var/log/boot.omsg with modules loaded
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c10 --- Comment #10 from Günther Schwarz <gu.schwarz@web.de> 2008-06-02 12:34:11 MDT --- Created an attachment (id=219608) --> (https://bugzilla.novell.com/attachment.cgi?id=219608) out of /var/log/messages with modules loaded
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c11 --- Comment #11 from Günther Schwarz <gu.schwarz@web.de> 2008-06-02 12:34:57 MDT --- (In reply to comment #7 from Jan Beulich)
> With all four modules suppressed, did you reconfigure X so it would still come up (I'm pretty certain it doesn't come up without reconfiguration)?
Not surprisingly X does not start. Sax2 fails with 'module not found': FATAL: Could not open '/lib/modules/2.6.25.4-2-xen/kernel/drivers/char/drm/i915.ko': No such file or directory [drm] failed to load kernel module "i915"
> Did you try suppressing just drm and i915?
No.
> The boot.omsg and messages fragment we need to see should be from a boot with all modules enabled, and with DRM debugging enabled prior to initiating shutdown (as described).
OK: I booted with all four modules in their usual place, echoed 1 to the drm debug parameter and initiated a shutdown. The output of /var/log/messages and /var/log/boot.omsg is in the two attachments.
> I take your comment about SLED to say that there you don't have the problem reported here? As cross reference knowledge, would you check whether the drm/i915 pair is in use there?
Yes, they are in use and seem to work just fine:
ssh test uname -a
Linux tfkp10 2.6.16.60-0.23-xen #1 SMP Thu May 15 06:38:31 UTC 2008 x86_64 x86_64 x86_64 GNU/Linux
ssh test lsmod | grep i915
i915                   28672  1
drm                    91944  2 i915

The new kernel for SP2 seemingly carries a lot of backported patches. A few months ago I tried to install SP1 on this hardware and gave up very quickly.
As you give a Novell email address: if this PC is of high priority for you, it might be more efficient to ask FSC for a sample than to have me try to follow your instructions. They are interested in getting the thing certified for Linux anyway.

Günther
https://bugzilla.novell.com/show_bug.cgi?id=394566

Günther Schwarz <gu.schwarz@web.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #219608|0                           |1
        is obsolete|                            |
https://bugzilla.novell.com/show_bug.cgi?id=394566

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c12

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|gu.schwarz@web.de           |daniel.rahn@novell.com

--- Comment #12 from Jan Beulich <jbeulich@novell.com> 2008-06-04 02:33:18 MDT ---
I'm pretty convinced that without being able to touch a machine this can be reproduced on, attempts to understand what's going on will lead nowhere (unless someone who knows drm/agp well would help out). Therefore, getting the exact machine model would be the first option (Daniel); locating one with the same chipset would then be the next fallback solution (Jason/Lynn).
https://bugzilla.novell.com/show_bug.cgi?id=394566

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c13

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|daniel.rahn@novell.com      |carnold@novell.com

--- Comment #13 from Jan Beulich <jbeulich@novell.com> 2008-06-18 05:21:47 MDT ---
With the machine at hand it wasn't difficult to figure out: 2.6.24 split the AGP memory page destruction into two phases (unmap and free). While in 2.6.25 this really isn't needed anymore, it was kept that way for a reason unknown to me. The problem is that there is a hidden assumption that gart_to_virt() returns the same result for the same input before and after unmapping a page, which is wrong for Xen.

2.6.24 merge patch updated accordingly. Charles, once checked in, please hand over to Lynn for getting a test build (which ideally the originator would be given for verification, but if that's unfeasible it should be possible to find a machine with an Intel Q35 chipset over there).
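The hidden assumption described in comment #13 can be illustrated with a toy model. This is not the kernel code: the class `ToyGart`, its methods, and all addresses are hypothetical, and the thread does not spell out why the translation changes under Xen, so the model simply assumes that unmapping rewrites the table the translation reads. Likewise, `destroy_translate_once` is just one way to avoid relying on the assumption and is not claimed to be what the committed patch does.

```python
class ToyGart:
    """Toy address translation layer (hypothetical, for illustration only)."""

    def __init__(self, stable):
        self.stable = stable      # True: translation never changes
        self.original = {}        # gart address -> virtual address at allocation
        self.current = {}         # gart address -> virtual address right now
        self.allocated = set()    # virtual addresses that may be freed

    def alloc_page(self, gart, virt):
        self.original[gart] = virt
        self.current[gart] = virt
        self.allocated.add(virt)

    def gart_to_virt(self, gart):
        # Stand-in for the translation step named in the comment.
        return self.original[gart] if self.stable else self.current[gart]

    def unmap_page(self, gart):
        # Assumption of this model: unmapping changes the current translation.
        self.current[gart] += 0x10000000

    def free_page(self, virt):
        if virt not in self.allocated:
            raise RuntimeError(f"freeing unknown page {virt:#x}")
        self.allocated.remove(virt)


def destroy_two_phase(layer, gart):
    """Unmap first, then translate again and free (relies on the assumption)."""
    layer.unmap_page(gart)
    layer.free_page(layer.gart_to_virt(gart))


def destroy_translate_once(layer, gart):
    """Translate before unmapping and reuse that result for the free."""
    virt = layer.gart_to_virt(gart)
    layer.unmap_page(gart)
    layer.free_page(virt)
```

With `stable=True` both destroy functions succeed. With `stable=False`, `destroy_two_phase` tries to free an address that was never allocated, which is the kind of mismatch the comment describes.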
https://bugzilla.novell.com/show_bug.cgi?id=394566

User carnold@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c14

Charles Arnold <carnold@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |carnold@novell.com
      Info Provider|carnold@novell.com          |lbendixs@novell.com

--- Comment #14 from Charles Arnold <carnold@novell.com> 2008-06-18 08:30:56 MDT ---
Patch has been committed. We can build and test.
https://bugzilla.novell.com/show_bug.cgi?id=394566 User gu.schwarz@web.de added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c15 --- Comment #15 from Günther Schwarz <gu.schwarz@web.de> 2008-06-18 12:14:10 MDT --- (In reply to comment #13 from Jan Beulich)
> 2.6.24 merge patch updated accordingly. Charles, once checked in, please hand over to Lynn for getting a test build (which ideally the originator would be given for verification, but if that's unfeasible it should be possible to find a machine with an Intel Q35 chipset over there).
If it is of any help I can try the updated kernel. But do not expect a reply within a few hours, as the machine in question is now in operation with the 2.6.16 kernel of SLED. I will have to wait until the user can spare it for a moment.

Günther
https://bugzilla.novell.com/show_bug.cgi?id=394566

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c16

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |funtasyspace@yahoo.com

--- Comment #16 from Jan Beulich <jbeulich@novell.com> 2008-06-19 03:53:28 MDT ---
*** Bug 400521 has been marked as a duplicate of this bug. ***
https://bugzilla.novell.com/show_bug.cgi?id=400521
https://bugzilla.novell.com/show_bug.cgi?id=394566

User jbeulich@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=394566#c17

Jan Beulich <jbeulich@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |RESOLVED
      Info Provider|lbendixs@novell.com         |
         Resolution|                            |FIXED

--- Comment #17 from Jan Beulich <jbeulich@novell.com> 2008-06-19 04:05:47 MDT ---
patches committed
https://bugzilla.novell.com/show_bug.cgi?id=394566 User meissner@novell.com added comment https://bugzilla.novell.com/show_bug.cgi?id=394566#c18 --- Comment #18 from Marcus Meissner <meissner@novell.com> 2008-07-08 08:50:56 MDT --- 11.0 update kernel released, version-release is 2.6.25.9-0.2