[opensuse] domUloader/kpartx issues
I'm hoping someone here can shed some light on this problem. The problem I'm having is with sles, not opensuse. I have a ticket open with Novell, my SAN vendor, and I've posted to the xen mailing list. Any help is appreciated.

I'm having a lot of problems with some pv domU's getting "Boot loader didn't return any data!" when trying to start them. Some will start fine on one server, but not on another with the same physical disk presented from a SAN. If I migrate the domU to the other server it will work, until it is shut down. Then it won't restart. (Live Migrate works fine. This is only mentioned to show that the disk CAN be accessed and the domU CAN run on this server.)

Here's some background.

* Pooled servers are sles11sp1 with kernel 2.6.32.49-0.3-xen
* xen version xen-4.0.2_21511_04-0.5.1
* multipath-tools-0.4.8-40.44.1
* VMs are sles11sp1 and some sles10
* Disks are provided by an Active/Active SAN
* Servers are using multipath
* Third party software (Convirture Enterprise) controls access to disk to prevent contention for the same disk
* Live Migrate on functioning domU's works fine
* If domU is then shut down, it will not start back up

I can add the same disks to an hvm domU and boot off system-rescue CD and I can mount the ext3 file systems of the disks fine.

If I look at it with kpartx I get:

xen07:~ # kpartx -l /dev/mapper/360030d900825cc06b07a721f577b3f62
360030d900825cc06b07a721f577b3f62p1 : 0 4194304 /dev/mapper/360030d900825cc06b07a721f577b3f62 2048
360030d900825cc06b07a721f577b3f62p2 : 0 37746688 /dev/mapper/360030d900825cc06b07a721f577b3f62 4196352
xen07:~ # kpartx -a /dev/mapper/360030d900825cc06b07a721f577b3f62
device-mapper: create ioctl failed: Device or resource busy
create/reload failed on 360030d900825cc06b07a721f577b3f62p1
device-mapper: create ioctl failed: Device or resource busy
create/reload failed on 360030d900825cc06b07a721f577b3f62p2

If I do this for a VM that I know 100% certain will boot, kpartx -a adds the mappings fine. Then I can delete them.

How can I determine why a physical device is locked or busy and who has control of it?

I have done quite a bit of testing and it seems that every time a domU gets this error, it is also not accessible with kpartx. I guess that makes sense. See the chart below.

Two vms:

Server   vm1                     vm2
xen05    boot & kpartx fail      boot & kpartx fail
xen06    boot & kpartx fail      boot & kpartx fail
xen07    boot & kpartx fail      boot & kpartx success
xen08    boot & kpartx success   boot & kpartx success
xen09    boot & kpartx success   boot & kpartx success

I have another issue that is possibly even related where I can't boot ANY pv domU's that are trying to boot off a full snapshot of the disk. At this point I don't know if it's a locking issue or not. I really need to get the above problem resolved then move on to the snapshot issue.

Best Regards,
James

--
To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse+owner@opensuse.org
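
On the question of why kpartx -a reports "Device or resource busy": one thing that could be checked, purely as an assumption and not as a diagnosis of this case, is whether mappings with the same p1/p2 names already exist on the failing host, for example left over from an earlier boot attempt. The sketch below reads kpartx -l output and prints any partition map name that is already present. The helper names part_map_names and existing_maps are hypothetical and are not part of kpartx or multipath-tools.

```shell
#!/bin/sh
# Sketch only: both helper names are made up for illustration.

# Read `kpartx -l <device>` output on stdin and print the partition map
# names, i.e. the first field of each "name : start size device offset" line.
part_map_names() {
    awk '$2 == ":" { print $1 }'
}

# Read map names on stdin and print those that already exist as entries
# in the given directory (default: /dev/mapper).
existing_maps() {
    dir="${1:-/dev/mapper}"
    while read -r name; do
        if [ -e "$dir/$name" ]; then
            echo "$name"
        fi
    done
    return 0
}
```

On a failing host this might be run as `kpartx -l /dev/mapper/360030d900825cc06b07a721f577b3f62 | part_map_names | existing_maps`. Any name it prints could then be inspected with `dmsetup info <name>`, which reports an open count for the mapping.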
On Fri, 17 Feb 2012 14:41:20 -0500 James Pifer <jep@obrien-pifer.com> wrote:

Hi James,

These factors should have given you pause about cross-posting here.

> The problem I'm having is with sles, not opensuse. I have a ticket open with Novell I've posted to xen mailing list.

This one, too, in particular:

> * Third party software (Convirture Enterprise) controls access to the disk ...

Maybe part of the problem is how you're describing it? Might I suggest something a little less complicated? You run Xen and have some guest VMs that will only boot on one host and not the other, although they'll 'live migrate' to the other and run there until shut down.

I found lots of intriguing clues and even some strong hints via Google searching for 'pv domu "Boot loader didn't return any data!"' My guess is the bootloader is not properly configured on the affected VMs.

Good luck!

Carl
On Fri, 2012-02-17 at 16:13 -0500, Carl Hartung wrote:
> On Fri, 17 Feb 2012 14:41:20 -0500 James Pifer <jep@obrien-pifer.com> wrote:
>
> Hi James,
>
> These factors should have given you pause about cross-posting here.
>
> > The problem I'm having is with sles, not opensuse. I have a ticket open with Novell I've posted to xen mailing list.

I consider the people on this list a valuable resource. As Opensuse is closely related to SLES, I don't feel it out of the question to ask a question here (IMHO).

> This one, too, in particular:
>
> > * Third party software (Convirture Enterprise) controls access to the disk ...

If I didn't include this someone would very likely question multiple servers with the same access to physical disks. It was only to state that safeguards are in place against contention for the same disk.

> Maybe part of the problem is how you're describing it? Might I suggest something a little less complicated? You run Xen and have some guest VMs that will only boot on one host and not the other, although they'll 'live migrate' to the other and run there until shut down.

The configuration IS somewhat complicated. I've been scolded enough times on mailing lists for not providing enough information.

> I found lots of intriguing clues and even some strong hints via Google searching for 'pv domu "Boot loader didn't return any data!"' My guess is the bootloader is not properly configured on the affected VMs.

Wish it was that easy. If the bootloader wasn't configured right the vms wouldn't start on any servers, but they do. The same config is used across the servers.

kpartx is not able to map the devices, so domUloader.py is not able to get the kernel from the disk. The question is why kpartx is having problems on some servers and not others, when the disk is not being used. Theoretically it shouldn't be locked/busy.

Thanks,
James
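
One way to look at whether a multipath map is in use by something else on a given host is the kernel's sysfs view of device-mapper nodes: each /sys/block/dm-N has a dm/name file with the map name and a holders/ directory listing block devices stacked on top of it. The sketch below prints those holders for a named map. dm_holders is a hypothetical helper name, and whether it shows anything useful for the hosts in this thread is an open question, since holders/ only covers stacked block devices and not every kind of open.

```shell
#!/bin/sh
# Sketch only: dm_holders is a made-up helper name.
# Usage: dm_holders MAPNAME [SYSFS_BLOCK_DIR]
# Finds the dm-N node whose dm/name matches MAPNAME and prints the names
# of the entries in its holders/ directory (devices stacked on top of it).
dm_holders() {
    map="$1"
    sysblock="${2:-/sys/block}"
    for node in "$sysblock"/dm-*; do
        [ -r "$node/dm/name" ] || continue
        if [ "$(cat "$node/dm/name")" = "$map" ]; then
            for h in "$node"/holders/*; do
                # Skip the unexpanded glob when holders/ is empty.
                [ -e "$h" ] || continue
                basename "$h"
            done
            return 0
        fi
    done
    echo "no device-mapper node named $map" >&2
    return 1
}
```

For the disk discussed here the call would be `dm_holders 360030d900825cc06b07a721f577b3f62`. Comparing its output, together with the open count shown by `dmsetup info`, between a host where kpartx succeeds and one where it fails might narrow down what differs.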