[Bug 551598] New: openSUSE 11.2 rc2 fails to boot in fully virtualized Xen VM on SLES 10 SP2 or SLES 11 GM
http://bugzilla.novell.com/show_bug.cgi?id=551598

Summary: openSUSE 11.2 rc2 fails to boot in fully virtualized Xen VM on SLES 10 SP2 or SLES 11 GM
Classification: openSUSE
Product: openSUSE 11.2
Version: RC 2
Platform: x86-64
OS/Version: openSUSE 11.2
Status: NEW
Severity: Blocker
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: jahudson@novell.com
QAContact: qa@suse.de
Found By: ---

The OS was installed with the Base Pattern only to save download time. LVM was used. Mostly default settings were used throughout the install. After the first reboot, openSUSE starts booting but ends up getting SCSI errors and stops. I then transferred the image to a SLES 11 system, thinking that perhaps openSUSE 11.2 was not compatible with SLES 10. The VM running in a SLES 11 virtual host behaves the same way.

I set up the console on serial and connected to the Xen console so I could retrieve the errors. Here's a snippet from when the problem starts. I'll include the entire log following this message.

[ 6.125143] ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 5
[ 6.127971] xen-platform-pci 0000:00:03.0: PCI INT A -> Link[LNKD] -> GSI 5 (level, low) -> IRQ 5
[ 6.133236] Xen version 3.3.
[ 6.134127] Hypercall area is 1 pages.
[ 6.157113] IRQ 5/xen-platform-pci: IRQF_DISABLED is not guaranteed on shared IRQs
[ 6.213582] suspend: event channel 4
[ 36.704474] ata1: lost interrupt (Status 0x0)
[ 36.707490] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
[ 36.710785] ata1.00: cmd c8/00:20:09:63:5a/00:00:00:00:00/e0 tag 0 dma 16384 in
[ 36.710787] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 36.718208] ata1.00: status: { DRDY }
[ 36.720860] ata1: soft resetting link
[ 36.874597] ata1.00: revalidation failed (errno=-2)
[ 41.873881] ata1: soft resetting link
[ 42.028607] ata1.00: revalidation failed (errno=-2)
[ 47.027861] ata1: soft resetting link
[ 47.182568] ata1.00: revalidation failed (errno=-2)
[ 47.185088] ata1.00: disabled
[ 47.187126] ata1.00: device reported invalid CHS sector 0
[ 47.190596] ata1: soft resetting link
[ 47.344321] ata1: EH complete
[ 47.345593] sd 0:0:0:0: [sda] Unhandled error code
[ 47.347853] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK
[ 47.352065] end_request: I/O error, dev sda, sector 5923593

--
Configure bugmail: http://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

http://bugzilla.novell.com/show_bug.cgi?id=551598

User jahudson@novell.com added comment
http://bugzilla.novell.com/show_bug.cgi?id=551598#c1

--- Comment #1 from Jared Hudson <jahudson@novell.com> 2009-10-30 15:44:33 MDT ---
Created an attachment (id=324958)
--> (http://bugzilla.novell.com/attachment.cgi?id=324958)
xm console output

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c3

Tejun Heo <teheo@novell.com> changed:

What          |Removed |Added
----------------------------------------------------------------------------
Status        |NEW     |NEEDINFO
Info Provider |        |jahudson@novell.com

--- Comment #3 from Tejun Heo <teheo@novell.com> 2009-11-23 02:29:29 UTC ---
Hmmm... First, the command timed out and then revalidation failed with -ENOENT, which means that libata is failing to read IDENTIFY data off the QEMU emulated drive. Can you please boot with ignore_loglevel and post the console output? It will show us why IDENTIFY reading is failing.

Given that the failure is on QEMU emulated devices, I don't think this has much to do with the ata_piix driver itself. It looks like IRQ delivery failed for some reason under Xen (which seems somewhat common) and then QEMU disk emulation just freaked out on the recovery sequence.

Thanks.
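
The `errno=-2` in the `revalidation failed` lines is what comment #3 reads as -ENOENT. The following is a minimal sketch, not part of the bug report, that scans console output for those lines and maps the number to its symbolic name with Python's standard `errno` module. The function name `decode_revalidation_errors` and the `SAMPLE` text (copied from the log excerpt in the original report) are illustrative assumptions.

```python
import errno
import re

# Matches libata lines such as:
#   [ 36.874597] ata1.00: revalidation failed (errno=-2)
PATTERN = re.compile(r"(ata\d+\.\d+): revalidation failed \(errno=-(\d+)\)")


def decode_revalidation_errors(log_text):
    """Return (device, symbolic errno name) pairs found in log_text."""
    results = []
    for device, number in PATTERN.findall(log_text):
        # errno.errorcode maps numeric codes to names; fall back to the raw number.
        name = errno.errorcode.get(int(number), f"errno {number}")
        results.append((device, name))
    return results


# Excerpt from the console output quoted in the original report.
SAMPLE = """\
[ 36.720860] ata1: soft resetting link
[ 36.874597] ata1.00: revalidation failed (errno=-2)
[ 41.873881] ata1: soft resetting link
[ 42.028607] ata1.00: revalidation failed (errno=-2)
"""

if __name__ == "__main__":
    for device, name in decode_revalidation_errors(SAMPLE):
        print(device, name)
```

Run against a full `xm console` capture, this only lists which devices failed revalidation and with which error code; it does not explain why IDENTIFY failed, which is what the `ignore_loglevel` output is for.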

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c4

Jared Hudson <jahudson@novell.com> changed:

What          |Removed             |Added
----------------------------------------------------------------------------
Status        |NEEDINFO            |NEW
Version       |RC 2                |Final
Info Provider |jahudson@novell.com |

--- Comment #4 from Jared Hudson <jahudson@novell.com> 2009-11-24 17:22:07 CST ---
I just did a fresh install with openSUSE 11.2 final. Now it does boot but still produces errors. They're just no longer fatal.

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c5

--- Comment #5 from Jared Hudson <jahudson@novell.com> 2009-11-24 17:22:58 CST ---
Created an attachment (id=329340)
--> (http://bugzilla.novell.com/attachment.cgi?id=329340)
opensuse11.2_ignore_loglevel.txt

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c6

Tejun Heo <teheo@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |agraf@novell.com

--- Comment #6 from Tejun Heo <teheo@novell.com> 2009-11-25 00:48:17 UTC ---
Those HSM failures are from QEMU emulated cdroms and are different from the original ones you reported. ISTR a similar problem with qemu-kvm. cc'ing Alex.

Alex, does xen-qemu work about the same as qemu-kvm? Jared is reporting HSM violations on the QEMU cdrom device and IIRC there was a similar issue with qemu-kvm, right?

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c7

Alexander Graf <agraf@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |jbeulich@novell.com,
     |        |jfehlig@novell.com

--- Comment #7 from Alexander Graf <agraf@novell.com> 2009-11-25 07:24:07 UTC ---
Xen uses its own fork of qemu for HVM. So chances are pretty good that it's similar.

The issue with qemu-kvm was/is only triggered on eject though. This looks more related to pv-ops or something similar. It seems like openSUSE 11.2 knows it's running inside Xen and loses interrupts? (rough guess)

So let's ask Jan and Jim if they know anything here.

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c8

Jan Beulich <jbeulich@novell.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC   |        |ksrinivasan@novell.com

--- Comment #8 from Jan Beulich <jbeulich@novell.com> 2009-11-25 09:06:59 UTC ---
That seems to be the 11.2 incarnation of a previously reported bug (and I thought we wouldn't repeat the same mistake): once the pv drivers are installed, the "native" ones (i.e. libata and friends) shouldn't be loaded anymore, as the pv drivers disable their respective PCI devices when they load.

KY should have the best overview of what was done where to accommodate that behavior, and hence who should make what change.

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c9

Alexander Graf <agraf@novell.com> changed:

What       |Removed          |Added
----------------------------------------------------------------------------
AssignedTo |teheo@novell.com |ksrinivasan@novell.com

--- Comment #9 from Alexander Graf <agraf@novell.com> 2009-12-20 21:02:09 UTC ---
KY, would you mind shedding some light on this?

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c10

--- Comment #10 from Kattiganehalli srinivasan <ksrinivasan@novell.com> 2009-12-21 19:05:56 UTC ---
On sles11, if I remember correctly, the problem we had was that disks would appear both as an IDE disk (managed by the PV driver) and as a SCSI disk managed by the libata/scsi driver stack. The way we dealt with this problem was to ensure that the PV drivers were loaded prior to loading libata. Look at the file xen_pvdrivers under /etc/modprobe.d/.

We could try a similar solution here. We would need to get the installation team involved to make the necessary changes.
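
The situation described in comment #10, where both the PV driver and the libata/scsi stack end up loaded and the same disk is visible through each, can be checked for on a running guest. The sketch below is an illustration only and not taken from the bug: it parses text in the `/proc/modules` format and reports whether both a PV block driver and libata are loaded. The module name `xen_vbd`, the helper names, and the sample text are assumptions; the actual PV driver module name depends on the driver package in use.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from /proc/modules-style text.

    Each line of /proc/modules starts with the module name, followed by
    size, reference count, dependencies, state, and load address.
    """
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])
    return names


def both_stacks_loaded(proc_modules_text, pv_driver="xen_vbd", native="libata"):
    """True if the (assumed) PV block driver and libata are both loaded."""
    modules = loaded_modules(proc_modules_text)
    return pv_driver in modules and native in modules


# Hypothetical sample in /proc/modules format, for illustration only.
SAMPLE_MODULES = """\
xen_vbd 20480 2 - Live 0xffffffffa0010000
libata 204800 1 ata_piix, Live 0xffffffffa0020000
ata_piix 28672 0 - Live 0xffffffffa0060000
"""

if __name__ == "__main__":
    try:
        with open("/proc/modules") as handle:
            text = handle.read()
    except OSError:
        # Fall back to the sample when /proc/modules is unavailable.
        text = SAMPLE_MODULES
    print("both stacks loaded:", both_stacks_loaded(text))
```

A positive result only shows that both stacks are present; whether that is a problem depends on load order, which is what the modprobe.d approach mentioned above is meant to control.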

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c11

--- Comment #11 from Kattiganehalli srinivasan <ksrinivasan@novell.com> 2010-01-20 15:41:45 UTC ---
Created an attachment (id=337683)
--> (http://bugzilla.novell.com/attachment.cgi?id=337683)
Preserve compatibility

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c12

Kattiganehalli srinivasan <ksrinivasan@novell.com> changed:

What          |Removed |Added
----------------------------------------------------------------------------
Status        |NEW     |NEEDINFO
Info Provider |        |jahudson@novell.com

--- Comment #12 from Kattiganehalli srinivasan <ksrinivasan@novell.com> 2010-01-20 15:44:23 UTC ---
The problem appears to be in the new PV drivers that we picked up. I have submitted a patch for this problem on the sle 11 sp1 code base. This patch should address the problem here as well. The patch is attached (comment #11).

Charles, if there is any update planned for 11.2, could you include this patch?

http://bugzilla.novell.com/show_bug.cgi?id=551598
http://bugzilla.novell.com/show_bug.cgi?id=551598#c13

Kattiganehalli srinivasan <ksrinivasan@novell.com> changed:

What          |Removed             |Added
----------------------------------------------------------------------------
Status        |NEEDINFO            |RESOLVED
Info Provider |jahudson@novell.com |
Resolution    |                    |FIXED

--- Comment #13 from Kattiganehalli srinivasan <ksrinivasan@novell.com> 2010-02-08 15:21:45 UTC ---
I am going to close this bug, since a patch has been submitted.