Bug ID 1202541
Summary KVM/libvirt bug on download.o.o VM host?
Classification openSUSE
Product openSUSE Tumbleweed
Version Current
Hardware Other
OS All
Status NEW
Severity Normal
Priority P5 - None
Component KVM
Assignee kvm-bugs@suse.de
Reporter bwiedemann@suse.com
QA Contact qa-bugs@suse.de
Found By Development
Blocker ---

Created attachment 860917 [details]
pontifex messages

Tonight download.opensuse.org was down for several hours.

While the last messages were about OOM, a closer look made me think
that some bug in our live-migration or KVM layer might be to blame.

download.o.o runs as a VM "pontifex2" on a KVM cluster with shared FC
block storage.
For Thursday maintenance we have scripts to live-migrate all VMs to an empty
host before we upgrade+reboot the host.
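
For readers unfamiliar with the setup, the evacuation step can be sketched
roughly as below. This is not our actual maintenance script; it is a minimal
illustration that only builds the `virsh migrate --live` command lines. The
function name and the destination host name "atreju-spare" are made up for
the example, and the qemu+ssh transport is an assumption.

```python
import shlex


def evacuation_commands(domains, dest_host):
    """Build one `virsh migrate --live` command per domain.

    Nothing is executed here; the caller decides whether to run the
    commands. The qemu+ssh:// URI form is an assumption about the transport.
    """
    uri = f"qemu+ssh://{dest_host}/system"
    return [["virsh", "migrate", "--live", name, uri] for name in domains]


if __name__ == "__main__":
    # "atreju-spare" is a hypothetical empty destination host.
    for cmd in evacuation_commands(["pontifex2"], "atreju-spare"):
        print(shlex.join(cmd))
```

On a real host the domain list would come from something like
`virsh list --name`, and each command would be run one after the other.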

Yesterday, atreju6 was rebooted at 09:21 UTC.

Then it took some minutes for the next host to be evacuated...
/var/log/libvirt/qemu/pontifex2.log shows
2022-08-18 09:38:40.092+0000: starting up libvirt version: 5.1.0, qemu version: 3.1.1SUSE Linux Enterprise 12, kernel: 4.12.14-122.130-default
2022-08-18 09:38:40.092+0000: Domain id=27 is tainted: host-cpu

The pontifex2 log has
2022-08-18T09:40:59 general protection fault, probably for non-canonical address 0x17ffffc0000010: 0000 [#1] PREEMPT SMP NOPTI

and from there it threw 292 backtraces until it panicked tonight.
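
One way to recount the faults from a saved copy of the guest log is to look
for the kernel's oops counter, the "[#N]" field seen in the line above. The
sketch below assumes each fault report carries such a counter; the figure
quoted in this report may have been obtained differently, and the function
name is made up for the example.

```python
import re

# Kernel oops headers carry a counter such as "[#1]".
OOPS_RE = re.compile(r"\[#(\d+)\]")


def count_oopses(lines):
    """Return (lines with an oops counter, highest counter value seen)."""
    counters = []
    for line in lines:
        match = OOPS_RE.search(line)
        if match:
            counters.append(int(match.group(1)))
    return len(counters), max(counters, default=0)


if __name__ == "__main__":
    sample = [
        "2022-08-18T09:40:59 general protection fault, probably for "
        "non-canonical address 0x17ffffc0000010: 0000 [#1] PREEMPT SMP NOPTI",
    ]
    print(count_oopses(sample))
```

Feeding it the lines of the attached messages file instead of the one-line
sample would give the total for the whole outage window.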

