Hi list,
this week I noticed something strange in our cloud (Ocata on Leap 42.3).
We use Ceph as backend for nova, glance and cinder, everything really
works like a charm. But from time to time we've noticed that some
instances take much longer to launch than others. So I wanted to take
a look what's happening, turned on debug logs and found that in some
cases (I have no idea how to reproduce yet) there is an image
downloaded to /var/lib/nova/instances/_base which then is used to
import it back to Ceph, that is obviously the reason for the delay.
The problem is that this new instance is not CoW, it's a flat rbd
image. Here are some relevant logs (instance_id:
65567fc1-017f-45dc-b0ee-570c44146119, image_id:
0da1ba0f-c504-45ea-b138-16026aec022b)
---cut here---
[...]
2018-10-04 11:46:39.189 10293 DEBUG nova.compute.manager
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] [instance:
65567fc1-017f-45dc-b0ee-570c44146119] Start spawning the instance on
the hypervisor. _build_and_run_instance
/usr/lib/python2.7/site-packages/nova/compute/manager.py:1929
[...]
2018-10-04 11:46:39.192 10293 INFO nova.virt.libvirt.driver
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] [instance:
65567fc1-017f-45dc-b0ee-570c44146119] Creating image
2018-10-04 11:46:39.220 10293 DEBUG
nova.virt.libvirt.storage.rbd_utils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] rbd image
65567fc1-017f-45dc-b0ee-570c44146119_disk does not exist __init__
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:77
2018-10-04 11:46:39.241 10293 DEBUG
nova.virt.libvirt.storage.rbd_utils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] rbd image
65567fc1-017f-45dc-b0ee-570c44146119_disk does not exist __init__
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:77
2018-10-04 11:46:39.245 10293 DEBUG oslo_concurrency.lockutils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Lock
"3c60237fbf59101d3411c4f795d0a72b82752e0b" acquired by
"nova.virt.libvirt.imagebackend.fetch_func_sync" :: waited 0.001s
inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:270
2018-10-04 11:46:39.246 10293 DEBUG oslo_concurrency.lockutils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Lock
"3c60237fbf59101d3411c4f795d0a72b82752e0b" released by
"nova.virt.libvirt.imagebackend.fetch_func_sync" :: held 0.001s inner
/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:282
2018-10-04 11:46:39.266 10293 DEBUG
nova.virt.libvirt.storage.rbd_utils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] rbd image
65567fc1-017f-45dc-b0ee-570c44146119_disk does not exist __init__
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:77
2018-10-04 11:46:39.269 10293 DEBUG oslo_concurrency.processutils
[req-85d728c3-5da1-4b37-add7-b956b4b2bb3d
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Running cmd (subprocess): rbd
import --pool images
/var/lib/nova/instances/_base/3c60237fbf59101d3411c4f795d0a72b82752e0b
65567fc1-017f-45dc-b0ee-570c44146119_disk --image-format=2 --id
openstack --conf /etc/ceph/ceph.conf execute
/usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:355
[...]
# no parent data
control:~ # rbd info images/65567fc1-017f-45dc-b0ee-570c44146119_disk
rbd image '65567fc1-017f-45dc-b0ee-570c44146119_disk':
size 6144 MB in 1536 objects
order 22 (4096 kB objects)
block_name_prefix: rbd_data.c8f88a6b8b4567
format: 2
features: layering, exclusive-lock, object-map
flags:
create_timestamp: Thu Oct 4 11:46:39 2018
---cut here---
###################################################################
For comparison I moved the respective base file and tried to spawn a
new instance from the same glance image:
---cut here---
[...]
2018-10-04 10:30:29.412 2336 DEBUG nova.compute.manager
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] [instance:
91d0b930-97b0-4dd0-81b4-929599b7c997] Start spawning the instance on
the hypervisor. _build_and_run_instance
/usr/lib/python2.7/site-packages/nova/compute/manager.py:1929
[...]
2018-10-04 10:30:29.415 2336 INFO nova.virt.libvirt.driver
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] [instance:
91d0b930-97b0-4dd0-81b4-929599b7c997] Creating image
2018-10-04 10:30:29.435 2336 DEBUG nova.virt.libvirt.storage.rbd_utils
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] rbd image
91d0b930-97b0-4dd0-81b4-929599b7c997_disk does not exist __init__
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:77
2018-10-04 10:30:29.455 2336 DEBUG nova.virt.libvirt.storage.rbd_utils
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] rbd image
91d0b930-97b0-4dd0-81b4-929599b7c997_disk does not exist __init__
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:77
2018-10-04 10:30:29.492 2336 DEBUG oslo_concurrency.lockutils
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Lock
"3c60237fbf59101d3411c4f795d0a72b82752e0b" acquired by
"nova.virt.libvirt.imagebackend.fetch_func_sync" :: waited 0.035s
inner /usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:270
2018-10-04 10:30:29.527 2336 DEBUG nova.virt.libvirt.imagebackend
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Image locations are: [{u'url':
u'rbd://655cb05a-435a-41ba-83d9-8549f7c36167/images/0da1ba0f-c504-45ea-b138-16026aec022b/snap', u'metadata': {}}, {'url': u'rbd://655cb05a-435a-41ba-83d9-8549f7c36167/images/0da1ba0f-c504-45ea-b138-16026aec022b/snap', 'metadata': {}}] clone
/usr/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py:903
2018-10-04 10:30:29.681 2336 DEBUG nova.virt.libvirt.imagebackend
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] Selected location: {u'url':
u'rbd://655cb05a-435a-41ba-83d9-8549f7c36167/images/0da1ba0f-c504-45ea-b138-16026aec022b/snap', u'metadata': {}} clone
/usr/lib/python2.7/site-packages/nova/virt/libvirt/imagebackend.py:912
2018-10-04 10:30:29.682 2336 DEBUG nova.virt.libvirt.storage.rbd_utils
[req-942ba103-1932-4adf-b9b2-670e1a2fc126
df7b63e69da3b1ee2be3d79342e7992f3620beddbdac7768dcb738105e74301e
2e3c3f3822124a3fa9fd905164f519ae - - -] cloning
images/0da1ba0f-c504-45ea-b138-16026aec022b@snap to
None/91d0b930-97b0-4dd0-81b4-929599b7c997_disk clone
/usr/lib/python2.7/site-packages/nova/virt/libvirt/storage/rbd_utils.py:236
[...]
# instance has parent data
control:~ # rbd info images/91d0b930-97b0-4dd0-81b4-929599b7c997_disk
rbd image '91d0b930-97b0-4dd0-81b4-929599b7c997_disk':
size 8192 MB in 1024 objects
order 23 (8192 kB objects)
block_name_prefix: rbd_data.c8abcb6b8b4567
format: 2
features: layering, exclusive-lock, object-map
flags:
create_timestamp: Thu Oct 4 10:30:29 2018
parent: images/0da1ba0f-c504-45ea-b138-16026aec022b@snap
overlap: 6144 MB
---cut here---
Why is there a local copy of the image in the first place? Obviously
nova doesn't need that and creates clones successfully without them.
The directory /var/lib/nova/instances/_base only contains 6 images.
I would be thankful for any explanation for this behaviour.
Regards,
Eugen