[opensuse-cloud] dnsmasq: cannot open or create lease file Permission denied
Hi,

On openSUSE Leap 42.1 with Liberty, I cannot get DHCP to work. It looks like a basic error, but I cannot find what is wrong. Does anybody have an idea?

BR, Jeroen.

+++++++++ from: dhcp-agent.log
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent File "/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py", line 159, in execute
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent raise RuntimeError(m)
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent RuntimeError:
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent Command: ['sudo', 'neutron-rootwrap', '/etc/neutron/rootwrap.conf', 'ip', 'netns', 'exec', 'qdhcp-dbb6455d-6d17-460a-8401-b57cf73640e8', 'dnsmasq', '--no-hosts', '--no-resolv', '--strict-order', '--except-interface=lo', '--pid-file=/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/pid', '--dhcp-hostsfile=/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/host', '--addn-hosts=/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/addn_hosts', '--dhcp-optsfile=/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/opts', '--dhcp-leasefile=/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/leases', '--dhcp-match=set:ipxe,175', '--bind-interfaces', '--interface=tape68276d1-e6', '--dhcp-range=set:tag0,10.172.200.0,static,86400s', '--dhcp-lease-max=256', '--conf-file=', '--domain=openstacklocal']
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent Exit code: 3
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent Stdin:
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent Stdout:
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent Stderr:
2016-02-15 10:14:20.986 4253 ERROR neutron.agent.dhcp.agent dnsmasq: cannot open or create lease file /var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8/leases: Permission denied
++++++++++++++++++

+++++++++++++++ content of the directory:
/var/lib/neutron/dhcp/dbb6455d-6d17-460a-8401-b57cf73640e8 # l
total 20
drwxr-xr-x 2 neutron neutron 79 Feb 15 10:13 ./
drwxr-xr-x 4 neutron neutron 94 Feb 15 09:47 ../
-rw-r--r-- 1 neutron neutron 68 Feb 15 10:13 addn_hosts
-rw-r--r-- 1 neutron neutron 67 Feb 15 10:13 host
-rw-r--r-- 1 neutron neutron 14 Feb 15 10:13 interface
-rw-r--r-- 1 neutron neutron 47 Feb 15 10:13 leases
-rw-r--r-- 1 neutron neutron 140 Feb 15 10:13 opts
++++++++++++++++++

--
To unsubscribe, e-mail: opensuse-cloud+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-cloud+owner@opensuse.org
On 2016-02-15 12:56, Jeroen Groenewegen van der Weyden wrote:
Hi,
On Opensuse leap 42.1 with liberty, I do not get dhcp to work. It seems I get a basic error but I cannot find what is wrong.
Anybody any idea?

It seems the directory is owned by neutron. Going through rootwrap, dnsmasq is started as root so that it can bind port 53, but it will then probably give away the extra privileges by changing to the "dnsmasq" user, which then gets "Permission denied" when writing to that directory.

On the other hand, on a working cloud I can see the leases file being owned by neutron. Shouldn't neutron be the one writing it, with dnsmasq just using it read-only?

Ciao
Bernhard M.
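To make the permission reasoning above concrete, here is a minimal Python sketch of the classic Unix write check applied to the modes from the directory listing. The numeric uid/gid values are hypothetical, chosen only for illustration, and the function models nothing but the traditional mode bits; ACLs and mandatory access control are out of scope.

```python
import stat


def may_write(mode, owner_uid, owner_gid, uid, gids):
    """Return True if a process with uid/gids passes the classic Unix
    write check for a file or directory with this mode and ownership."""
    if uid == 0:
        # root bypasses the mode bits (ignoring dropped capabilities)
        return True
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if owner_gid in gids:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


# Hypothetical numeric ids, for illustration only.
NEUTRON_UID, NEUTRON_GID = 1001, 1001
DNSMASQ_UID, DNSMASQ_GID = 1002, 1002

LEASES_MODE = 0o644  # -rw-r--r-- as shown for "leases" in the listing
DIR_MODE = 0o755     # drwxr-xr-x as shown for the directory itself

for label, uid, gids in (
    ("root", 0, {0}),
    ("neutron", NEUTRON_UID, {NEUTRON_GID}),
    ("dnsmasq", DNSMASQ_UID, {DNSMASQ_GID}),
):
    print(label,
          "leases:", may_write(LEASES_MODE, NEUTRON_UID, NEUTRON_GID, uid, gids),
          "directory:", may_write(DIR_MODE, NEUTRON_UID, NEUTRON_GID, uid, gids))
```

Under these assumptions only root and neutron pass the check, which matches the suspicion that a process running as another user could not write the lease file.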
Hi Jeroen, Quoting Jeroen Groenewegen van der Weyden <groen692@grosc.com>:
Hi,
On Opensuse leap 42.1 with liberty, I do not get dhcp to work. It seems I get a basic error but I cannot find what is wrong.
Anybody any idea?
Might it be that you're running with AppArmor activated? The default profile for dnsmasq does not allow access to files in /var/lib/neutron.

Regards, Jens
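A quick way to check this suggestion is sketched below in Python. It assumes the usual Linux locations for the AppArmor state (/sys/module/apparmor/parameters/enabled and /sys/kernel/security/apparmor/profiles) and the usual "name (mode)" line format of the profile list; the helper names are made up for this example, and reading the profile list normally requires root.

```python
import re
from pathlib import Path

# Assumed standard locations of the AppArmor kernel interface.
ENABLED_FLAG = Path("/sys/module/apparmor/parameters/enabled")
PROFILES = Path("/sys/kernel/security/apparmor/profiles")


def parse_profiles(text):
    """Map profile name -> mode from lines like '/usr/sbin/dnsmasq (enforce)'."""
    profiles = {}
    for line in text.splitlines():
        match = re.match(r"^(.*) \((\w+)\)$", line.strip())
        if match:
            profiles[match.group(1)] = match.group(2)
    return profiles


def dnsmasq_profiles(text):
    """Keep only the profiles whose name mentions dnsmasq."""
    return {name: mode for name, mode in parse_profiles(text).items()
            if "dnsmasq" in name}


def report():
    """Describe the AppArmor state for dnsmasq on the local machine."""
    try:
        enabled = ENABLED_FLAG.read_text().strip() == "Y"
    except OSError:
        enabled = False
    if not enabled:
        return "AppArmor does not appear to be enabled"
    try:
        found = dnsmasq_profiles(PROFILES.read_text())
    except OSError:
        return "AppArmor is enabled, but the profile list is not readable (try as root)"
    if not found:
        return "AppArmor is enabled, no dnsmasq profile is loaded"
    return "AppArmor is enabled, dnsmasq profiles: %s" % found


if __name__ == "__main__":
    print(report())
```

If a dnsmasq profile shows up in enforce mode, the "Permission denied" on /var/lib/neutron would be consistent with the explanation above even though the file modes look correct.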
Hi Jens, You were right, AppArmor was activated. After deactivating AppArmor, DHCP is running. Thank you. BR, Jeroen. On 15-2-2016 at 18:43, Jens-U. Mozdzen wrote:
Hi Jeroen,
might it be you're running with AppArmor activated? The default profile for dnsmasq does not allow for access to files in /var/lib/neutron.
Regards, Jens
Hi all,

I'm running OpenStack Mitaka in a Leap environment: 1 controller and 2 compute nodes (for testing purposes one with Xen, the other with KVM). The storage backend is a Ceph cluster consisting of three nodes, and all the relevant services (glance, cinder-volume, nova) use Ceph as their storage backend. I have uploaded two different images to glance, a small cirros image and a Leap image.

When I launch an instance from an image and choose to create a new volume for it, there is no problem. But if I try to boot from an image without a volume, I get an error in nova-compute.log on my Xen compute node:

---cut here---
libvirtError: internal error: libxenlight failed to create new domain 'instance-00000229'
---cut here---

and libxl reports:

---cut here---
2016-05-19 11:38:44 CEST libxl: error: libxl_device.c:300:libxl__device_disk_set_backend: Disk vdev=xvda failed to stat: rbd:images/39c61537-52e5-487c-9ec4-457b3612f549_disk:<CEPH-CREDENTIALS: No such file or directory
2016-05-19 11:38:44 CEST libxl: error: libxl_create.c:930:initiate_domain_create: Unable to set disk defaults for disk 0
---cut here---

I've been debugging this and found that the resulting XML config is missing the driver_name, which has to be 'qemu' in this case. I find the libxl error saying there is no such file rather misleading, but it is the consequence of nova being unaware of the right backend driver.

I tested this in different ways. I took the XML config of a failing instance (from the debug output) and tried to "virsh define" that VM; it failed until I added "name='qemu'" to the driver tag. Then, as a workaround, I added "qemu" directly in the Python code, which works fine now.

---cut here---
compute1:/usr/lib/python2.7/site-packages/nova/virt/libvirt # diff -u config.py.dist config.py
--- config.py.dist 2016-05-07 20:35:01.000000000 +0200
+++ config.py 2016-05-19 13:34:54.564961720 +0200
@@ -745,6 +745,8 @@
 dev.set("type", self.source_type)
 dev.set("device", self.source_device)
+ if (self.target_bus == 'xen'):
+ self.driver_name = 'qemu'
 if (self.driver_name is not None or
 self.driver_format is not None or
 self.driver_cache is not None or
---cut here---

If you use a volume to launch that instance, there is a call to libvirt_utils.pick_disk_driver_name, which also supplies 'qemu' in the XML string, and the instance starts successfully.

With KVM, the generated XML config contains no driver name either, but virsh dumpxml <instance> shows:

---cut here---
<disk type='network' device='disk'>
<driver name='qemu' type='raw' cache='none'/>
<auth username='openstack'>
---cut here---

So I have to assume that the driver name is not added to the Xen config by libxl, while it is added to the KVM config.

@Jim Fehlig: As I found your name many times in the libvirt changelogs, I hoped you could give some advice or comment on this.

Regards, Eugen

--
Eugen Block voice : +49-40-559 51 75
NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77
Postfach 61 03 15
D-22423 Hamburg e-mail : eblock@nde.ag
Vorsitzende des Aufsichtsrates: Angelika Mozdzen
Sitz und Registergericht: Hamburg, HRB 90934
Vorstand: Jens-U. Mozdzen
USt-IdNr. DE 814 013 983
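The effect of the workaround can be illustrated outside of nova with a small Python sketch that post-processes a <disk> element. This is not the nova code path; the function name and the sample XML (including the image name) are invented for the example, and it only mirrors the idea of the patch: set the driver name to 'qemu' for disks on the 'xen' bus when no name is present.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample, shaped like the <disk> fragments quoted in this thread.
XEN_DISK = """
<disk type='network' device='disk'>
  <driver type='raw' cache='none'/>
  <source protocol='rbd' name='images/EXAMPLE_disk'/>
  <target dev='xvda' bus='xen'/>
</disk>
"""


def ensure_qemu_driver(disk):
    """Set driver name='qemu' on a xen-bus disk that has no driver name.

    Returns True if the element was modified.
    """
    target = disk.find("target")
    if target is None or target.get("bus") != "xen":
        return False
    driver = disk.find("driver")
    if driver is None:
        driver = ET.SubElement(disk, "driver")
    if driver.get("name") is None:
        driver.set("name", "qemu")
        return True
    return False


disk = ET.fromstring(XEN_DISK.strip())
changed = ensure_qemu_driver(disk)
print(changed, ET.tostring(disk.find("driver"), encoding="unicode").strip())
```

Like the patch itself, the sketch keys only on the target bus, so it shares the same limitation: it does not look at the disk type or at anything else about the guest.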
Eugen Block wrote:
Hi all,
I'm running Openstack Mitaka in a Leap environment, 1 controller, 2 compute nodes (for testing purposes one with xen, the other with kvm)
Are you using libvirt 1.2.18 and xen 4.5 from the Leap updates, or something newer? I added support for network based block devices (including rbd) to the libvirt libxl driver in the libvirt 1.3.2 release cycle, so you'll need libvirt >= 1.3.2 for this to work with xen.
Options are:

* use Tumbleweed
* wait for Leap 42.2
* update your Leap xen compute nodes to the packages in the Virtualization repo
  http://download.opensuse.org/repositories/Virtualization/openSUSE_Leap_42.1/

The last option requires updating all the virt-related packages. We have no automated tests for such a configuration, so your mileage may vary.

Regards, Jim
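The version requirement can be checked with a few lines of Python. The sketch below is an illustration only: the helper names are invented, and the integer decoding assumes the usual libvirt convention of reporting versions as major * 1,000,000 + minor * 1,000 + release (as returned, for example, by the Python binding's getVersion()).

```python
# Minimum libvirt version with network disk support in the libxl driver,
# taken from the statement above.
MIN_LIBXL_NETWORK_DISK = (1, 3, 2)


def decode_libvirt_version(number):
    """Split libvirt's integer version (e.g. 1003002) into (major, minor, release)."""
    major, rest = divmod(number, 1000000)
    minor, release = divmod(rest, 1000)
    return (major, minor, release)


def parse_version_string(text):
    """Turn a dotted version such as '1.2.18' into a comparable tuple."""
    return tuple(int(part) for part in text.split("."))


def supports_network_disks_on_xen(version):
    """True if the (major, minor, release) tuple meets the minimum version."""
    return version >= MIN_LIBXL_NETWORK_DISK


for candidate in ("1.2.18", "1.3.2", "1.3.4"):
    version = parse_version_string(candidate)
    print(candidate, supports_network_disks_on_xen(version))
```

Comparing tuples rather than strings avoids the classic mistake of treating "1.2.18" as newer than "1.3.2".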
Hi, thanks for your quick response!
Are you using libvirt 1.2.18 and xen 4.5 from the Leap updates
I'm using libvirt version 1.3.4 from obs://build.opensuse.org/Virtualization. If I use the option boot from image (creates a new volume) it works just fine and that driver_name 'qemu' is provided, but it's missing if I don't use a volume.

Regards, Eugen
I would like to add some more detailed information to my last reply.

---cut here---
compute1:~ # rpm -qi python-nova-13.0.1~a0~dev46-1.1.noarch
Name        : python-nova
Version     : 13.0.1~a0~dev46
Release     : 1.1
Architecture: noarch
Install Date: Di 10 Mai 2016 12:38:57 CEST
Group       : Development/Languages/Python
Size        : 16549000
License     : Apache-2.0
Signature   : RSA/SHA256, Mo 09 Mai 2016 13:39:20 CEST, Key ID 893a90dad85f9316
Source RPM  : openstack-nova-13.0.1~a0~dev46-1.1.src.rpm
Build Date  : Mo 09 Mai 2016 13:38:09 CEST
Build Host  : cloud113
Relocations : (not relocatable)
Vendor      : obs://build.opensuse.org/Cloud:OpenStack
URL         : https://launchpad.net/nova
Summary     : OpenStack Compute (Nova) - Python module
Description : This package contains the core Python module of OpenStack Nova.
Distribution: Cloud:OpenStack:Mitaka / openSUSE_Leap_42.1
#################################################################
compute1:~ # rpm -qi xen-libs
Name        : xen-libs
Version     : 4.7.0_03
Release     : 440.1
Architecture: x86_64
Install Date: Di 10 Mai 2016 13:59:52 CEST
Group       : System/Kernel
Size        : 1560640
License     : GPL-2.0
Signature   : RSA/SHA256, Fr 06 Mai 2016 16:33:12 CEST, Key ID a193fbb572174fc2
Source RPM  : xen-4.7.0_03-440.1.src.rpm
Build Date  : Fr 06 Mai 2016 16:31:47 CEST
Build Host  : build74
Relocations : (not relocatable)
Vendor      : obs://build.opensuse.org/Virtualization
#################################################################
compute1:~ # rpm -qi qemu-block-rbd
Name        : qemu-block-rbd
Version     : 2.5.93
Release     : 327.6
Architecture: x86_64
Install Date: Di 10 Mai 2016 14:53:26 CEST
Group       : System/Emulators/PC
Size        : 84024
License     : BSD-3-Clause and GPL-2.0 and GPL-2.0+ and LGPL-2.1+ and MIT
Signature   : (none)
Source RPM  : qemu-2.5.93-327.6.src.rpm
Build Date  : Di 10 Mai 2016 14:42:57 CEST
Build Host  : compute1.cloud.hh.nde.ag
---cut here---

As you can see, we're running a self-compiled version of qemu. I'm not sure if it still holds true for the above version, but at least with earlier versions we had to modify the spec file to enable RBD support.

Considering your reply, Xen in conjunction with RBD should work, but my tests show that it doesn't. For completeness' sake, I ran an additional test case and now see the following behavior:

- KVM, boot from volume: "driver_name" provided by Nova
- KVM, boot from image: "driver_name" provided by libvirt
- Xen, boot from volume: "driver_name" provided by Nova
- Xen, boot from image: error, no-one provides "driver_name"

So, to achieve some kind of consistency, it seems that libvirt would need to be changed to provide the driver name if an instance is launched from an image without creating a new volume.

What is the way to go for me now? Should I file a bug report?

Best regards, Eugen
On 05/20/2016 04:17 AM, Eugen Block wrote:
For completeness sake, I ran an additional test case and now see the following behavior:
- KVM, boot from volume: "driver_name" provided by Nova
- KVM, boot from image: "driver_name" provided by libvirt
- Xen, boot from volume: "driver_name" provided by Nova
- Xen, boot from image: error, no-one provides "driver_name"
Ah, thanks for this summary. Can you provide the full <disk> config produced by nova for the KVM and Xen boot from image cases? Sensitive info can be replaced with 'xxxx' or something else of your choosing :-).
So to achieve some kind of consistency it seems that it would be necessary to change libvirt to provide the driver name if an instance is launched from an image without creating a new volume.
Network backed disks (<disk type='network' .../>) in Xen are only supported via qdisk. So I think it is reasonable for the libvirt libxl driver to use 'qemu' when a driver name is not specified. Also, it should produce an error if the driver name is specified but not equal to 'qemu'.
What is the way to go for me now? Should I file a bug report?
Yes, please. File it under the Virtualization:Tools component and provide the <disk> config requested above, along with other relevant info from this thread.

Regards, Jim
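The rule proposed in this reply (default to 'qemu' when no driver name is given, reject anything else) can be written down as a short Python sketch. This is only an illustration of the logic; the libvirt libxl driver is written in C, and the function name and error text here are invented.

```python
def resolve_network_disk_driver(driver_name=None):
    """Return the driver name to use for a network backed disk on Xen.

    Mirrors the proposed behavior: an unspecified name defaults to 'qemu',
    and any explicitly specified name other than 'qemu' is an error.
    """
    if driver_name is None:
        return "qemu"
    if driver_name != "qemu":
        raise ValueError(
            "network backed disks on Xen require driver 'qemu', got %r"
            % driver_name)
    return driver_name


print(resolve_network_disk_driver())        # name omitted, default applies
print(resolve_network_disk_driver("qemu"))  # explicit and valid
try:
    resolve_network_disk_driver("phy")      # explicit but unsupported here
except ValueError as err:
    print("rejected:", err)
```

With a rule like this in the driver, the "Xen, boot from image" case from the summary would no longer depend on nova supplying the name.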
Hi,

I filed a bug report (Bug 981094). For completeness, here are the <disk> configs from nova:

Xen:
---cut here---
<disk type='network' device='disk'>
  <driver type='raw' cache='none'/>
  <auth username='openstack'>
    <secret type='ceph' uuid='****'/>
  </auth>
  <source protocol='rbd' name='images/867682d4-1ff0-4937-9d66-0423ea512679_disk'>
    <host name='HOST1' port='6789'/>
    <host name='HOST2' port='6789'/>
    <host name='HOST3' port='6789'/>
  </source>
  <target dev='xvda' bus='xen'/>
</disk>
---cut here---

KVM:
---cut here---
<disk type="network" device="disk">
  <driver type="raw" cache="none"/>
  <source protocol="rbd" name="images/36341065-4fbb-4e3f-bd20-db0eb7299832_disk">
    <host name="HOST1" port="6789"/>
    <host name="HOST2" port="6789"/>
    <host name="HOST3" port="6789"/>
  </source>
  <auth username="openstack">
    <secret type="ceph" uuid="****"/>
  </auth>
  <target bus="virtio" dev="vda"/>
</disk>
---cut here---

The resulting <disk> config for KVM contains the driver name:

---cut here---
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <auth username='openstack'>
    <secret type='ceph' uuid='****'/>
  </auth>
  <source protocol='rbd' name='images/36341065-4fbb-4e3f-bd20-db0eb7299832_disk'>
    <host name='HOST1' port='6789'/>
    <host name='HOST2' port='6789'/>
    <host name='HOST3' port='6789'/>
  </source>
  <backingStore/>
  <target dev='vda' bus='virtio'/>
  <alias name='virtio-disk0'/>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/>
</disk>
---cut here---

Thanks for your assistance, Jim. I CC'd you on the bug, if that's okay.

Best regards, Eugen
On 05/23/2016 05:17 AM, Eugen Block wrote:
Hi,
I filed a bug report (Bug 981094).
Thanks. Any further discussions can take place in the bug.

Regards, Jim
Hi all,

I would like to follow up on an issue I had a couple of months ago with nova libvirt. During my tests with SUSE Cloud 5 there was a problem launching Xen VMs: the wrong bootloader was selected for images with version >= SLE12. There was a service request for that issue, ServiceRequestID 10969858451. The proposed fix was this piece of code in python-nova/virt/libvirt/driver:

---cut here---
+ if CONF.libvirt.virt_type == 'xen':
+ guest.os_kernel = '/usr/lib/grub2/x86_64-xen/grub.xen'
---cut here---

This worked just fine, but now that I use OpenStack (Mitaka) Cloud, I run into the same issue. So every time I update my packages, I first have to patch driver.py on my compute nodes.

---cut here---
compute1:/usr/lib/python2.7/site-packages/nova/virt/libvirt # diff -u /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py.dist /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py
--- /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py.dist 2016-05-07 20:35:01.000000000 +0200
+++ /usr/lib/python2.7/site-packages/nova/virt/libvirt/driver.py 2016-05-27 14:01:11.210389066 +0200
@@ -4369,6 +4372,7 @@
 if virt_type == "xen":
 if guest.os_type == vm_mode.HVM:
 guest.os_loader = CONF.libvirt.xen_hvmloader_path
+ guest.os_kernel = '/usr/lib/grub2/x86_64-xen/grub.xen'
 elif virt_type in ("kvm", "qemu"):
 if caps.host.cpu.arch in (arch.I686, arch.X86_64):
 guest.sysinfo = self._get_guest_config_sysinfo(instance)
---cut here---

I know there is an existing bug for this [1], but unfortunately I'm not allowed to view its content, so I don't know how to proceed. Should I create a new bug for OpenStack in Launchpad?

Regards, Eugen

[1] https://bugzilla.opensuse.org/show_bug.cgi?id=945453
Hi Eugen, On Mon, May 30, 2016 at 09:21:36AM +0200, Eugen Block wrote:
Hi all,
I would like to follow up on an issue I had a couple of months ago with nova libvirt. During my tests with SUSE Cloud 5 there was a problem launching Xen VMs, the wrong bootloader was selected for images with version >= SLE12. There was a service request for that issue, ServiceRequestID 10969858451. The proposed fix was this piece of code in python-nova/virt/libvirt/driver:
---cut here--- + if CONF.libvirt.virt_type == 'xen': + guest.os_kernel = '/usr/lib/grub2/x86_64-xen/grub.xen' ---cut here---
This worked just fine, but now that I use Openstack (Mitaka) Cloud, I run into the same issue. So everytime I update my packages, I first have to patch the driver.py on my compute nodes.
You don't need this patch. Install the grub2-xen package and create a glance image with the kernel from the package (/usr/lib/grub2/x86_64-xen/grub.xen). Then, for your Xen image, add the kernel ID of the previously created image (glance image-update xen-image-uuid --property kernel_id=grub-xen-uuid). Now the image should boot with the correct kernel.

Best, Tom
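The two glance steps described above can be sketched as a small Python helper that only builds the command lines and does not run them. This is a rough sketch, not a tested procedure: the image name `grub-xen`, the `aki` disk and container formats, and the placeholder IDs are assumptions for illustration and do not come from this thread.

```python
GRUB_XEN_PATH = "/usr/lib/grub2/x86_64-xen/grub.xen"  # path named in the mail


def grub_xen_glance_commands(guest_image_id, grub_image_id,
                             grub_path=GRUB_XEN_PATH):
    """Return the two glance command lines as argument lists.

    guest_image_id: ID of the Xen guest image that should boot via grub.xen.
    grub_image_id:  ID glance reports after the grub.xen upload (only known
                    once the first command has actually been run).
    """
    # Step 1: upload grub.xen as a kernel image (name and formats assumed).
    upload = ["glance", "image-create",
              "--name", "grub-xen",
              "--disk-format", "aki",
              "--container-format", "aki",
              "--file", grub_path]
    # Step 2: point the guest image at the uploaded kernel.
    link = ["glance", "image-update", guest_image_id,
            "--property", "kernel_id=" + grub_image_id]
    return upload, link


if __name__ == "__main__":
    for cmd in grub_xen_glance_commands("<xen-image-uuid>", "<grub-xen-uuid>"):
        print(" ".join(cmd))
```

Note that the image being updated is the guest image, while `kernel_id` carries the ID of the uploaded grub.xen image.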
Hi Thomas,

thanks for the quick response! I ran some tests to verify your suggestion. Please note that we are using Ceph as storage backend, which is why I had to run the tests for SLE11 twice. Here is a summary of my tests:

cirrOS:
  w/o changes in driver.py: works
  w/ changes in driver.py: works
  w/ glance kernel-id: fails

SLE12:
  w/o changes in driver.py: fails
  w/ changes in driver.py: works
  w/ glance kernel-id: works

SLE11 (Image in RBD pool):
  w/o changes in driver.py: fails
  w/ changes in driver.py: works
  w/ glance kernel-id: works

SLE11 (Image in local fs):
  w/o changes in driver.py: works
  w/ changes in driver.py: works
  w/ glance kernel-id: works

The SLE12 image did not boot without my patch. I updated the image property with the grub.xen uuid as suggested, and that worked quite well. I tried the same with a cirros image, but it did not boot with the kernel-id property. It's just an image for testing purposes, but extremely helpful for quick tests, so I need it to work, too.

Then I tested a SLE11 image to see if older images are still working. Without modifications in driver.py or the glance image, the VM wouldn't boot. Is it possible that pygrub is not capable of working with network-backed images? When I switched back to the local file system, the VM booted without any problems and without any changes. The rbd-backed image worked both with a kernel-id and with the code change.

So in general, your suggestion works. But regarding the maintenance of a cloud, it seems not very handy. If I understand correctly, in case of a grub update I would have to upload a new grub image to glance, then replace the reference to the old kernel-id with the newly uploaded kernel-id. That doesn't sound very useful, to be honest.

Although the code change is very practical for me, I have to admit that it's not a general solution, as it doesn't take into account any system-relevant information besides the virt_type. But let's say you would build a real patch that contains some information on how to choose the correct bootloader, wouldn't that be more practical?

Another question comes to my mind: is there any reason to stick with pygrub instead of using grub.xen? To me it seems that grub.xen is more powerful (e.g. booting rbd images), and IIRC that's what SUSE did in the python-nova package, at least in the patch I received within the service request mentioned before. Is there any reason I'm not seeing right now to not use grub.xen as default?

Regards, Eugen
Eugen Block wrote:
Hi Thomas,
thanks for the quick response! I ran some tests to verify your suggestion. Please notice that we are using Ceph as storage backend, that's why I had to run the tests for SLE11 twice.
Here is a summary of my tests:
cirrOS: w/o changes in driver.py: works w/ changes in driver.py: works w/ glance kernel-id: fails
SLE12: w/o changes in driver.py: fails w/ changes in driver.py: works w/ glance kernel-id: works
SLE11 (Image in RBD pool): w/o changes in driver.py: fails w/ changes in driver.py: works w/ glance kernel-id: works
SLE11 (Image in local fs): w/o changes in driver.py: works w/ changes in driver.py: works w/ glance kernel-id: works
The SLE12 image did not boot without my patch, I updated the image property with the grub.xen uuid as suggested, that worked quite well. I tried the same with a cirros image, it did not boot with the kernel-id property. It's just an image for testing purposes, but extremely helpful for quick tests, so I need it to work, too.
Then I tested a SLE11 image to see if older images are still working. Without modifications in driver.py or the glance image, the VM wouldn't boot. Is it possible that pygrub is not capable of working with network backed images?
pygrub should work with network-based block devices. Is your xen compute node running latest xen and libvirt, including the fix for bug#981094? Regardless, /var/log/libvirt/libxl/libxl-driver.log (and perhaps related logs in /var/log/xen/) should contain some hints as to what failed.
Because when I switched back to local file system, the VM boots without any problems, also without any changes. It worked both with a kernel-id and with a code change for the rbd backed image.
So in general, your suggestion works. But regarding the maintenance of a cloud, it seems not very handy. If I understand correctly, in case of a grub update I would have to upload a new grub image to glance, then replace the reference to the old kernel-id with the newly uploaded kernel-id. That doesn't sound very useful, to be honest.
Although the code change is very practical for me, I have to admit that it's not a general solution as it doesn't take into account any system relevant information besides the virt_type. But let's say you would build a real patch that contains some information on how to choose the correct bootloader, wouldn't that be more practical?
Another question comes to my mind: is there any reason to stick with pygrub instead of using grub.xen?
IMO, grub.xen should always be used for PV instances in a cloud environment. pygrub mounts the image in the compute node (dom0) to extract the kernel/initrd, which unnecessarily exposes dom0 to potential vulnerabilities due to a rogue image.

Regards, Jim
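The question of how a "real patch" might choose the bootloader from more than just the virt_type can be illustrated with a small, self-contained Python sketch. This is not nova's code and not the change that was proposed upstream: the function name `choose_xen_boot`, its parameters, the default hvmloader path and the returned dict are all hypothetical, and the strings "xen" and "hvm" only mirror the values seen in the diff above.

```python
GRUB_XEN = "/usr/lib/grub2/x86_64-xen/grub.xen"  # path used in the thread


def choose_xen_boot(virt_type, os_type, image_kernel_id=None,
                    hvmloader_path="/usr/lib/xen/boot/hvmloader"):
    """Return guest boot settings to apply (hypothetical helper).

    virt_type:       hypervisor type, e.g. "xen", "kvm", "qemu"
    os_type:         guest mode, "hvm" for fully virtualised guests
    image_kernel_id: kernel_id property of the glance image, if any
    hvmloader_path:  assumed default, would normally come from config
    """
    if virt_type != "xen":
        return {}                       # nothing Xen-specific to do
    if os_type == "hvm":
        return {"os_loader": hvmloader_path}
    if image_kernel_id:
        return {}                       # image brings its own kernel
    # PV guest without an explicit kernel: boot through grub.xen
    # instead of letting dom0 run pygrub against the guest disk.
    return {"os_kernel": GRUB_XEN}


if __name__ == "__main__":
    print(choose_xen_boot("xen", "xen"))
    print(choose_xen_boot("xen", "hvm"))
    print(choose_xen_boot("kvm", "hvm"))
```

Compared with the one-line patch, this variant leaves HVM guests and images with an explicit kernel_id untouched, which is the kind of extra information the mail asks about.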
Hi,
Is your xen compute node running latest xen and libvirt, including the fix for bug#981094?
Yes it does:

---cut here---
compute1:~ # rpm -qi libvirt-1.3.4-573.1.x86_64
Name        : libvirt
Version     : 1.3.4
Release     : 573.1
Architecture: x86_64
Install Date: Do 26 Mai 2016 14:55:31 CEST
Group       : Development/Libraries/C and C++
Size        : 106
License     : LGPL-2.1+
Signature   : RSA/SHA256, Mi 25 Mai 2016 19:31:20 CEST, Key ID a193fbb572174fc2
Source RPM  : libvirt-1.3.4-573.1.src.rpm
Build Date  : Mi 25 Mai 2016 19:30:03 CEST
Build Host  : build78
Relocations : (not relocatable)
Vendor      : obs://build.opensuse.org/Virtualization
URL         : http://libvirt.org/
##############################################
compute1:~ # rpm -qi xen-libs
Name        : xen-libs
Version     : 4.7.0_03
Release     : 440.1
Architecture: x86_64
Install Date: Di 10 Mai 2016 13:59:52 CEST
Group       : System/Kernel
Size        : 1560640
License     : GPL-2.0
Signature   : RSA/SHA256, Fr 06 Mai 2016 16:33:12 CEST, Key ID a193fbb572174fc2
Source RPM  : xen-4.7.0_03-440.1.src.rpm
Build Date  : Fr 06 Mai 2016 16:31:47 CEST
Build Host  : build74
Relocations : (not relocatable)
Vendor      : obs://build.opensuse.org/Virtualization
---cut here---

Here are the logs from /var/log/libvirt/libxl/libxl-driver.log and /var/log/xen/bootloader.26.log:

---cut here---
compute1:~ # tail /var/log/libvirt/libxl/libxl-driver.log
2016-06-02 10:32:59 CEST libxl: error: libxl_bootloader.c:635:bootloader_finished: bootloader failed - consult logfile /var/log/xen/bootloader.26.log
2016-06-02 10:32:59 CEST libxl: error: libxl_exec.c:118:libxl_report_child_exitstatus: bootloader [28303] exited with error status 1
2016-06-02 10:32:59 CEST libxl: error: libxl_create.c:1222:domcreate_rebuild_done: cannot (re-)build domain: -3
##############################################
compute1:~ # tail /var/log/xen/bootloader.26.log
Traceback (most recent call last):
  File "/usr/lib/xen/bin/pygrub", line 923, in <module>
    part_offs = get_partition_offsets(file)
  File "/usr/lib/xen/bin/pygrub", line 114, in get_partition_offsets
    image_type = identify_disk_image(file)
  File "/usr/lib/xen/bin/pygrub", line 57, in identify_disk_image
    fd = os.open(file, os.O_RDONLY)
OSError: [Errno 2] No such file or directory: 'rbd:images/551a1dd6-ce9b-44c9-87f4-c2058efd94f6_disk:id=openstack:key=<KEY>:auth_supported=cephx\\;none:mon_host=<HOST1>\\:6789\\;<HOST2>\\:6789\\;<HOST3>\\:6789'
---cut here---

The output from libxl-driver.log is always the first thing I check if an instance fails to boot. If it points to pygrub, I apply the patch described above to driver.py so that grub.xen is used.

Regards, Eugen
On 06/02/2016 02:48 AM, Eugen Block wrote:
compute1:~ # tail /var/log/xen/bootloader.26.log
Traceback (most recent call last):
  File "/usr/lib/xen/bin/pygrub", line 923, in <module>
    part_offs = get_partition_offsets(file)
  File "/usr/lib/xen/bin/pygrub", line 114, in get_partition_offsets
    image_type = identify_disk_image(file)
  File "/usr/lib/xen/bin/pygrub", line 57, in identify_disk_image
    fd = os.open(file, os.O_RDONLY)
OSError: [Errno 2] No such file or directory: 'rbd:images/551a1dd6-ce9b-44c9-87f4-c2058efd94f6_disk:id=openstack:key=<KEY>:auth_supported=cephx\\;none:mon_host=<HOST1>\\:6789\\;<HOST2>\\:6789\\;<HOST3>\\:6789'
Heh, I'd expect open(2) to fail on such a filename :-). I was thinking the dom0 bootloader code would 'block-attach' the rbd device to dom0, allowing pygrub to access it. I guess I was wrong about pygrub working with network-based block devices. But as mentioned before, for security reasons pvgrub (aka grub.xen) should be used anyhow. Note also that pygrub doesn't know how to grok btrfs, so it wouldn't work for images on the local fs that contain btrfs either.

Regards, Jim
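The failure in the traceback can be reproduced outside of Xen with a few lines of Python: an rbd disk specification is not a path in the local file system, so a plain `os.open()` on it fails with ENOENT. The sketch below is only an illustration; the helper names and the list of schemes are made up for this example and are not part of pygrub or libxl, and the shortened disk string is a placeholder.

```python
import errno
import os


def looks_like_network_disk(spec, schemes=("rbd", "nbd", "iscsi")):
    """Guess whether a disk spec is a 'scheme:...' string, not a local path.

    The scheme list is an assumption for this example.
    """
    scheme, sep, _rest = spec.partition(":")
    return bool(sep) and scheme in schemes and not os.path.exists(spec)


def open_errno(spec):
    """Open spec read-only as pygrub does; return 0 or the errno."""
    try:
        fd = os.open(spec, os.O_RDONLY)
    except OSError as exc:
        return exc.errno
    os.close(fd)
    return 0


if __name__ == "__main__":
    spec = "rbd:images/example_disk:id=openstack"  # shortened placeholder
    print(looks_like_network_disk(spec))
    print(errno.errorcode.get(open_errno(spec)))
```

A bootloader that runs in dom0 would first have to attach such a device locally before it could read it, whereas grub.xen runs inside the guest and reads the guest's own virtual disk.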
But as mentioned before, for security reasons pvgrub (aka grub.xen) should be used anyhow.
So my approach to add that line to the code in driver.py is not that bad, is it? :-) Wouldn't that be a reason to consider adding this as a real patch as described in the previous mails?

Regards, Eugen
On 06/02/2016 09:29 AM, Eugen Block wrote:
But as mentioned before, for security reasons pvgrub (aka grub.xen) should be used anyhow.
So my approach to add that line to the code in driver.py is not that bad, is it? :-)
No, not at all IMO.
Wouldn't that be a reason to consider adding this as a real patch as described in the previous mails?
Yes. But I'm out of touch with nova development these days. Maybe one of the cloud folks could push that upstream.

Regards, Jim
On Thu, Jun 02, 2016 at 09:35:02AM -0600, Jim Fehlig wrote:
On 06/02/2016 09:29 AM, Eugen Block wrote:
But as mentioned before, for security reasons pvgrub (aka grub.xen) should be used anyhow.
So my approach to add that line to the code in driver.py is not that bad, is it? :-)
No, not at all IMO.
Wouldn't that be a reason to consider adding this as a real patch as described in the previous mails?
Yes. But I'm out of touch with nova development these days. Maybe one of the cloud folks could push that upstream.
I tried that already. See https://review.openstack.org/#/c/264101 .

Cheers, Tom
I tried that already. See https://review.openstack.org/#/c/264101
I added a comment to the review referring to this thread and asking for the change to be applied. It's required in our cloud environment, so I hope it will be processed.

Regards, Eugen
Hi list,

I'm working with ceph as storage backend for my cloud environment, and that works fine. But when I try to get ceilometer meters for my rbd resources, I get errors in ceilometer-polling.log:

---cut here---
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk [req-823d46da-5a23-45bd-a0e9-596e7d4a9fc3 admin - - - -] Ignoring instance instance-000002d7 (51d7bfdc-feec-4f13-ad0c-190dcfa2c62d) : this function is not supported by the connection driver: virDomainGetBlockInfo
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk Traceback (most recent call last):
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 625, in get_samples
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk instance,
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 567, in _populate_cache
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk for disk, info in disk_info:
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/virt/libvirt/inspector.py", line 215, in inspect_disk_info
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk block_info = domain.blockInfo(device)
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib64/python2.7/site-packages/libvirt.py", line 690, in blockInfo
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk libvirtError: this function is not supported by the connection driver: virDomainGetBlockInfo
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk
---cut here---

This bug [1] describes the issue, but it seems to be a libvirt issue, not a ceilometer one. So I wanted to ask if someone knows (maybe Jim Fehlig :-D ) whether somebody is working on it, whether there's already a fix, or any information at all.

I believe I have completed all required steps: installed rados-gateway including a running apache, integrated keystone with the radosgw, and added meters to /etc/ceilometer/pipeline.yaml, but I don't get any rbd meters. I tried on both xen and kvm compute nodes:

KVM:
compute2:~ # rpm -qi libvirt-1.2.18.2-8.1.x86_64
Name        : libvirt
Version     : 1.2.18.2
Release     : 8.1
Architecture: x86_64
Install Date: Mo 27 Jun 2016 11:44:18 CEST
Group       : Development/Libraries/C and C++
Size        : 106
License     : LGPL-2.1+
Signature   : RSA/SHA256, Fr 17 Jun 2016 13:54:18 CEST, Key ID b88b2fd43dbdc284
Source RPM  : libvirt-1.2.18.2-8.1.src.rpm
Build Date  : Fr 17 Jun 2016 13:51:46 CEST
Build Host  : build21
Relocations : (not relocatable)
Packager    : http://bugs.opensuse.org
Vendor      : openSUSE

XEN:
compute3:~ # rpm -qi libvirt-1.3.5-587.1.x86_64
Name        : libvirt
Version     : 1.3.5
Release     : 587.1
Architecture: x86_64
Install Date: Mo 27 Jun 2016 11:47:22 CEST
Group       : Development/Libraries/C and C++
Size        : 106
License     : LGPL-2.1+
Signature   : RSA/SHA256, Fr 24 Jun 2016 19:54:06 CEST, Key ID a193fbb572174fc2
Source RPM  : libvirt-1.3.5-587.1.src.rpm
Build Date  : Fr 24 Jun 2016 19:52:28 CEST
Build Host  : cloud103
Relocations : (not relocatable)
Vendor      : obs://build.opensuse.org/Virtualization

I'd appreciate any information on that!

Best regards, Eugen

[1] https://bugs.launchpad.net/ceilometer/+bug/1457440
On 06/29/2016 08:40 AM, Eugen Block wrote:
Hi list,
I'm working with ceph as storage backend for my cloud environment, that works fine. But when I try to get ceilometer meters for my rbd resources I get errors in ceilometer-polling.log
---cut here---
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk [req-823d46da-5a23-45bd-a0e9-596e7d4a9fc3 admin - - - -] Ignoring instance instance-000002d7 (51d7bfdc-feec-4f13-ad0c-190dcfa2c62d) : this function is not supported by the connection driver: virDomainGetBlockInfo
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk Traceback (most recent call last):
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 625, in get_samples
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk instance,
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 567, in _populate_cache
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk for disk, info in disk_info:
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/virt/libvirt/inspector.py", line 215, in inspect_disk_info
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk block_info = domain.blockInfo(device)
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib64/python2.7/site-packages/libvirt.py", line 690, in blockInfo
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self)
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk libvirtError: this function is not supported by the connection driver: virDomainGetBlockInfo
Yep, the domainGetBlockInfo function is not implemented in the libvirt libxl driver.
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk ---cut here---
This bug [1] describes the issue, but it seems to be a libvirt issue, not ceilometer. So I wanted to ask if someone knows (maybe Jim Fehlig :-D ) if somebody is working on it, if there's already a fix or any information at all.
I'm not aware of anyone working on an implementation for the libxl driver. Patches are welcome if this is something you are able to do :-).

Regards, Jim
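Until a driver implements the call, a poller can at least tell "not supported by this driver" apart from real failures. The following is a minimal sketch of that idea, not ceilometer's actual code: `safe_block_info` is a hypothetical helper, and it relies only on the libvirt Python binding's `libvirtError.get_error_code()` and on the numeric value 3 of `VIR_ERR_NO_SUPPORT`, which is hard-coded here so the example runs without libvirt installed.

```python
VIR_ERR_NO_SUPPORT = 3  # same value as libvirt.VIR_ERR_NO_SUPPORT


def safe_block_info(domain, device):
    """Return domain.blockInfo(device), or None if the driver lacks it.

    Any other error is re-raised unchanged. The function name and the
    None convention are assumptions made for this example.
    """
    try:
        return domain.blockInfo(device)
    except Exception as exc:
        get_code = getattr(exc, "get_error_code", None)
        if get_code is not None and get_code() == VIR_ERR_NO_SUPPORT:
            return None  # e.g. the libxl driver, as reported above
        raise


if __name__ == "__main__":
    class FakeError(Exception):
        """Stand-in for libvirt.libvirtError in this demo."""
        def get_error_code(self):
            return VIR_ERR_NO_SUPPORT

    class FakeXenDomain(object):
        """Stand-in for a domain on a driver without blockInfo."""
        def blockInfo(self, device):
            raise FakeError("this function is not supported")

    print(safe_block_info(FakeXenDomain(), "xvda"))
```

With a real connection, `domain` would be a `libvirt.virDomain` object and the exception a `libvirt.libvirtError`; the disk meters for such instances would then simply be skipped rather than logged as errors.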
Thanks for your reply, Jim.
Patches are welcome if this is something you are able to do
I don't think I am, but it's worth trying. ;-) But I'll be on vacation from Monday until the end of July, just fyi. Regards, Eugen Zitat von Jim Fehlig <jfehlig@suse.com>:
On 06/29/2016 08:40 AM, Eugen Block wrote:
Hi list,
I'm working with ceph as storage backend for my cloud environment, that works fine. But when I try to get ceilometer meters for my rbd resources I get errors in ceilometer-polling.log
---cut here--- 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk [req-823d46da-5a23-45bd-a0e9-596e7d4a9fc3 admin - - - -] Ignoring instance instance-000002d7 (51d7bfdc-feec-4f13-ad0c-190dcfa2c62d) : this function is not supported by the connection driver: virDomainGetBlockInfo 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk Traceback (most recent call last): 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 625, in get_samples 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk instance, 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/pollsters/disk.py", line 567, in _populate_cache 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk for disk, info in disk_info: 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib/python2.7/site-packages/ceilometer/compute/virt/libvirt/inspector.py", line 215, in inspect_disk_info 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk block_info = domain.blockInfo(device) 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk File "/usr/lib64/python2.7/site-packages/libvirt.py", line 690, in blockInfo 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk if ret is None: raise libvirtError ('virDomainGetBlockInfo() failed', dom=self) 2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk libvirtError: this function is not supported by the connection driver: virDomainGetBlockInfo
Yep, the domainGetBlockInfo function is not implemented in the libvirt libxl driver.
2016-06-29 15:45:19.711 29483 ERROR ceilometer.compute.pollsters.disk ---cut here---
This bug [1] describes the issue, but it seems to be a libvirt issue, not ceilometer. So I wanted to ask if someone knows (maybe Jim Fehlig :-D ) if somebody is working on it, if there's already a fix or any information at all.
I'm not aware of anyone working on an implementation for the libxl driver. Patches are welcome if this is something you are able to do :-).
Regards, Jim
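When several instances are affected, it can help to pull the instance name and the unsupported libvirt call out of ceilometer-polling.log. The following is only a minimal sketch, assuming log lines shaped like the single "Ignoring instance" line quoted above; the function name `find_unsupported_calls` and the regular expression are illustrative and not part of ceilometer.

```python
import re

# Pattern derived from the one sample line quoted in this thread; it is an
# assumption about the log format, not taken from ceilometer's source.
UNSUPPORTED = re.compile(
    r"Ignoring instance (?P<name>\S+) \((?P<uuid>[0-9a-f-]{36})\)\s*: "
    r"this function is not supported by the connection driver: (?P<call>\w+)"
)


def find_unsupported_calls(lines):
    """Return (instance name, uuid, libvirt call) tuples found in log lines."""
    hits = []
    for line in lines:
        match = UNSUPPORTED.search(line)
        if match:
            hits.append(
                (match.group("name"), match.group("uuid"), match.group("call"))
            )
    return hits
```

Feeding it the lines of the polling log gives one tuple per skipped instance, which makes it easier to see whether only Xen (libxl) guests are hit.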
-- Eugen Block voice : +49-40-559 51 75 NDE Netzdesign und -entwicklung AG fax : +49-40-559 51 77 Postfach 61 03 15 D-22423 Hamburg e-mail : eblock@nde.ag Vorsitzende des Aufsichtsrates: Angelika Mozdzen Sitz und Registergericht: Hamburg, HRB 90934 Vorstand: Jens-U. Mozdzen USt-IdNr. DE 814 013 983 -- To unsubscribe, e-mail: opensuse-cloud+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-cloud+owner@opensuse.org
Hi,
I have to update our xen compute nodes (OpenStack Mitaka on Leap 42.1, Ceph as storage backend), so I tried that with the first one. Now libvirt updated to version 3.0.0 and qemu to version 2.7.0 (from repo [1]), and as usual, we had to build the qemu-block-rbd package ourselves. Unfortunately, the compute node doesn't want to start instances anymore. Although nova reports "Instance started successfully", qemu reports:
---cut here---
compute3:~ # tail /var/log/xen/qemu-dm-minisuse.log
xen be: qdisk-51712: xen be: qdisk-51712: error: Unknown protocol 'rbd'
error: Unknown protocol 'rbd'
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed
---cut here---
I already know that message and it was the reason for our previous conversation here in this mailing list. Now I'm wondering, why we're at the same point again and what can I do to get that compute back up? The other xen compute node has libvirt version 2.5.0-617.1 installed, together with self-compiled qemu-2.7.0. Should I try to downgrade libvirt back to 2.5? Or has rbd support been disabled for version 3.0?
Thanks for any hints!
Regards, Eugen
[1] http://download.opensuse.org/repositories/Virtualization/openSUSE_Leap_42.1/
Quoting Jim Fehlig <jfehlig@suse.com>:
Eugen Block wrote:
Hi all,
I'm running Openstack Mitaka in a Leap environment, 1 controller, 2 compute nodes (for testing purposes one with xen, the other with kvm)
Are you using libvirt 1.2.18 and xen 4.5 from the Leap updates, or something newer? I added support for network based block devices (including rbd) to the libvirt libxl driver in the libvirt 1.3.2 release cycle, so you'll need libvirt >= 1.3.2 for this to work with xen.
Options are:
* use Tumbleweed
* wait for Leap 42.2
* update your Leap xen compute nodes to the packages in the Virtualization repo
  http://download.opensuse.org/repositories/Virtualization/openSUSE_Leap_42.1/
The last option requires updating all the virt-related packages. We have no automated tests for such a configuration, so your mileage may vary.
Regards, Jim
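The version requirement above can be checked mechanically. This is a rough sketch, assuming the installed version is available as a plain dotted string (for example as reported by `libvirtd --version`, with any package release suffix such as "-617.1" stripped first); the helper names are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version such as '1.2.18' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


# Minimum libvirt release named in the reply above for rbd disks on Xen.
MIN_LIBVIRT_FOR_XEN_RBD = parse_version("1.3.2")


def libxl_supports_network_disks(installed):
    """True if the installed libvirt version is at least 1.3.2."""
    return parse_version(installed) >= MIN_LIBVIRT_FOR_XEN_RBD
```

Comparing tuples of integers rather than raw strings matters here, because a plain string comparison would sort "1.10.0" before "1.3.2".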
and the storage backend is a ceph cluster consisting of three nodes, all the relevant services use ceph as storage backend (glance, cinder-volume, nova). I have uploaded two different images to glance, a small cirros image and a leap image.
Now when I launch an instance from image and choose to create a new volume for that, it's no problem. But if I try to boot from image without a volume, I get an error on my xen-compute nova-compute.log:
---cut here--- libvirtError: internal error: libxenlight failed to create new domain 'instance-00000229' ---cut here---
and libxl reports:
---cut here---
2016-05-19 11:38:44 CEST libxl: error: libxl_device.c:300:libxl__device_disk_set_backend: Disk vdev=xvda failed to stat: rbd:images/39c61537-52e5-487c-9ec4-457b3612f549_disk:<CEPH-CREDENTIALS: No such file or directory
2016-05-19 11:38:44 CEST libxl: error: libxl_create.c:930:initiate_domain_create: Unable to set disk defaults for disk 0
---cut here---
I've been debugging this and found out that the resulting xml config is missing the driver_name, which has to be 'qemu' in this case. I find the libxl error saying there is no such file kind of misleading, but it's the consequence of nova being unaware of the right backend driver.
To test this, I took the xml config of a failing instance (from the debug output) and tried to "virsh define" that VM; it failed until I added "name='qemu'" to the driver tag. Then, as a workaround, I added "qemu" directly into the python code, which works fine now.
---cut here---
compute1:/usr/lib/python2.7/site-packages/nova/virt/libvirt # diff -u config.py.dist config.py
--- config.py.dist 2016-05-07 20:35:01.000000000 +0200
+++ config.py 2016-05-19 13:34:54.564961720 +0200
@@ -745,6 +745,8 @@

         dev.set("type", self.source_type)
         dev.set("device", self.source_device)
+        if (self.target_bus == 'xen'):
+            self.driver_name = 'qemu'
         if (self.driver_name is not None or
             self.driver_format is not None or
             self.driver_cache is not None or
---cut here---
If you use a volume to launch that instance from, there is a function call libvirt_utils.pick_disk_driver_name which also provides qemu to the xml string and the instance gets started successfully.
The difference to kvm is that there is no driver name in the generated xml config, but virsh dumpxml <instance> provides:
---cut here---
<disk type='network' device='disk'>
  <driver name='qemu' type='raw' cache='none'/>
  <auth username='openstack'>
---cut here---
So I have to assume that libxl does not add the driver name to the xen config but it does to the kvm config.
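To make the effect of the workaround easier to see outside of nova, here is a small standalone sketch using only the Python standard library. It is not nova's code: `build_disk_element` and its parameters are hypothetical, and it only mirrors the idea of the patch above (force the driver name to 'qemu' when the target bus is 'xen').

```python
import xml.etree.ElementTree as ET


def build_disk_element(source_type, source_device, target_bus, target_dev,
                       driver_name=None):
    """Build a libvirt-style <disk> element (illustrative, not nova's code)."""
    disk = ET.Element("disk", {"type": source_type, "device": source_device})
    # Same idea as the config.py workaround: Xen guests get the qemu backend.
    if target_bus == "xen":
        driver_name = "qemu"
    # Only emit <driver> when a name is known, as in the generated kvm config.
    if driver_name is not None:
        ET.SubElement(disk, "driver", {"name": driver_name})
    ET.SubElement(disk, "target", {"bus": target_bus, "dev": target_dev})
    return disk


if __name__ == "__main__":
    xen_disk = build_disk_element("network", "disk", "xen", "xvda")
    print(ET.tostring(xen_disk, encoding="unicode"))
```

With target bus 'xen' the element carries a driver child named 'qemu'; with any other bus and no explicit driver name, no driver element is written at all, which matches the difference described above between the xen and kvm configs.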
@Jim Fehlig: As I found your name many times in the changelogs of libvirt I hoped you could give some advice or any comment on that.
Regards, Eugen
On 02/17/2017 09:39 AM, Eugen Block wrote:
Hi,
I have to update our xen compute nodes (OpenStack Mitaka on Leap 42.1, Ceph as storage backend), so I tried that with the first one. Now libvirt updated to version 3.0.0 and qemu to version 2.7.0 (from repo [1]), and as usual, we had to build the qemu-block-rbd package ourselves.
It sounds a bit like qemu and your self-built qemu-block-rbd are not playing well together. Is Xen also updated to 4.8? The latest of all these packages on 42.1 is not well tested :-).
Unfortunately, the compute node doesn't want to start instances anymore. Although nova reports "Instance started successfully", qemu reports:
---cut here---
compute3:~ # tail /var/log/xen/qemu-dm-minisuse.log
xen be: qdisk-51712: xen be: qdisk-51712: error: Unknown protocol 'rbd'
error: Unknown protocol 'rbd'
xen be: qdisk-51712: xen be: qdisk-51712: initialise() failed
initialise() failed
---cut here---
I already know that message and it was the reason for our previous conversation here in this mailing list. Now I'm wondering, why we're at the same point again and what can I do to get that compute back up? The other xen compute node has libvirt version 2.5.0-617.1 installed, together with self-compiled qemu-2.7.0. Should I try to downgrade libvirt back to 2.5? Or has rbd support been disabled for version 3.0?
No. From the build log you can see rbd support in libvirt is enabled
[ 101s] configure: RBD: yes
Regards, Jim
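Since libvirt itself has rbd enabled, one thing worth checking on the qemu side is whether the installed tools list rbd at all. Below is a hedged sketch: it relies on the "Supported formats:" line that `qemu-img --help` prints, and the function names are invented for this example. Note the assumption that `qemu-img` reflects the same block drivers as the qemu device model Xen actually starts, which may not hold on every setup.

```python
import subprocess


def formats_from_help(help_text):
    """Extract the names after 'Supported formats:' in qemu-img help output."""
    for line in help_text.splitlines():
        if line.startswith("Supported formats:"):
            return line.split(":", 1)[1].split()
    return []


def qemu_img_supports(fmt, qemu_img="qemu-img"):
    """Run 'qemu-img --help' and report whether fmt is listed."""
    result = subprocess.run([qemu_img, "--help"], stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT, universal_newlines=True)
    return fmt in formats_from_help(result.stdout)
```

On a compute node, `qemu_img_supports("rbd")` returning False would point at the qemu-block-rbd package rather than at libvirt.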
Is Xen also updated to 4.8?
Yes, this was also updated to 4.8. But my colleague managed to help me with this and was able to downgrade xen back to 4.7 and libvirt back to 2.5.
One of the other compute nodes was upgraded to Leap 42.2 without self-built qemu-block-rbd (it's already enabled), and it works like a charm! It runs these versions:
xen-4.7.1_04-6.1.x86_64
qemu-block-rbd-2.6.2-26.1.x86_64
libvirtd (libvirt) 2.0.0
So we decided to test it with Leap-42.2 and if it works, probably upgrade the other compute nodes, too.
Regards, Eugen
Hi list,
this is just a test. I am subscribed to this list, but did not get the notification about Pike packages available for Leap 42.3 yet. Should I unsubscribe and re-subscribe again?
Regards, Eugen
Eugen Block <eblock@nde.ag> wrote:
Hi list,
this is just a test. I am subscribed to this list, but did not get the notification about Pike packages available for Leap 42.3 yet. Should I unsubscribe and re-subscribe again?
No, I don't think re-subscribing would change anything. Was there a specific notification you were expecting?
In general I think we're just doing a bad job of using this mailing list; instead a lot of activity generated by SUSE's larger-than-ever Cloud team is hidden away on internal SUSE mailing lists. I'd encourage everyone to complain loudly until this changes ;-)
I'll also post a reminder to the internal lists, encouraging more frequent usage of this list. We really do want to build the openSUSE cloud community further, despite what it may look like ;-)
Hi Adam, I received my own email yesterday, so I guess there's no need to do anything. ;-)
No, I don't think re-subscribing would change anything. Was there a specific notification you were expecting?
I was just wondering because my colleague received the email announcing the Pike packages and told me about it, but I didn't get anything although all other mails addressed to this list are coming through. It's quite mysterious, but since it seems to be fine now, I'll leave it with that. ;-)
Regards, Eugen
Eugen Block <eblock@nde.ag> wrote:
Hi Adam,
I received my own email yesterday, so I guess there's no need to do anything. ;-)
No, I don't think re-subscribing would change anything. Was there a specific notification you were expecting?
I was just wondering because my colleague received the email announcing the Pike packages and told me about it, but I didn't get anything although all other mails addressed to this list are coming through. It's quite mysterious, but since it seems to be fine now, I'll leave it with that. ;-)
Oh, I see. That email did reach my inbox, but I had forgotten about it. It definitely reached the list too:
https://lists.opensuse.org/opensuse-cloud/2017-08/msg00000.html
Is it possible that it hit one of your other mail filters, due to being cross-posted to openstack-{dev,operators}?
On 2017-08-30 15:04, Adam Spiers wrote:
Oh, I see. That email did reach my inbox, but I had forgotten about it. It definitely reached the list too:
https://lists.opensuse.org/opensuse-cloud/2017-08/msg00000.html
Is it possible that it hit one of your other mail filters, due to being cross-posted to openstack-{dev,operators}?
Actually, I had the same problem: for me it was sorted into another IMAP folder, because it was also addressed to another ML I am subscribed to.
Mail was To: opensuse-cloud@opensuse.org, openstack-dev@lists.openstack.org, openstack-operators@lists.openstack.org
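For anyone debugging a similar case of a cross-posted mail landing in the wrong folder, listing every address in the To and Cc headers shows which list filters could have matched. This is just a sketch with the Python standard library; `list_recipients` is an illustrative name, and which filter wins still depends on the rule order in your own mail setup.

```python
from email import message_from_string
from email.utils import getaddresses


def list_recipients(raw_message):
    """Return the lower-cased addresses found in the To and Cc headers."""
    msg = message_from_string(raw_message)
    headers = msg.get_all("To", []) + msg.get_all("Cc", [])
    return [addr.lower() for _name, addr in getaddresses(headers)]
```

If more than one of the returned addresses has its own filter, the mail ends up wherever the first matching rule puts it.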
Is it possible that it hit one of your other mail filters, due to being cross-posted to openstack-{dev,operators}?
You were right, I found the email in another folder that is handled by a filter for another mailing list. So there is an explanation ;-)
Thanks for the hint!
participants (8)
- Adam Spiers
- Bernhard M. Wiedemann
- Bernhard M. Wiedemann
- Eugen Block
- Jens-U. Mozdzen
- Jeroen Groenewegen van der Weyden
- Jim Fehlig
- Thomas Bechtold