Bug ID | 1202541 |
---|---|
Summary | KVM/libvirt bug on download.o.o VM host? |
Classification | openSUSE |
Product | openSUSE Tumbleweed |
Version | Current |
Hardware | Other |
OS | All |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | KVM |
Assignee | kvm-bugs@suse.de |
Reporter | bwiedemann@suse.com |
QA Contact | qa-bugs@suse.de |
Found By | Development |
Blocker | --- |
Created attachment 860917 [details]
pontifex messages
Tonight download.opensuse.org was down for several hours.
While the last messages were about OOM, a closer look made me think
that some bug in our live-migration or KVM layer might be to blame.
download.o.o runs as a VM "pontifex2" on a KVM cluster with shared FC
block storage.
For the Thursday maintenance we have scripts that live-migrate all VMs from a
host to an empty one before we upgrade and reboot that host.
Yesterday, atreju6 was rebooted at 09:21 UTC.
Then it took some minutes for the next host to be evacuated.
/var/log/libvirt/qemu/pontifex2.log shows:
2022-08-18 09:38:40.092+0000: starting up libvirt version: 5.1.0, qemu version:
3.1.1SUSE Linux Enterprise 12, kernel: 4.12.14-122.130-default
2022-08-18 09:38:40.092+0000: Domain id=27 is tainted: host-cpu
The pontifex2 log has:
2022-08-18T09:40:59 general protection fault, probably for non-canonical
address 0x17ffffc0000010: 0000 [#1] PREEMPT SMP NOPTI
and from there it threw 292 backtraces until it panicked tonight.
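For reference, the gap between the qemu start logged by libvirt and the first general protection fault can be computed from the two timestamps quoted above. This is only a rough sketch: the helper name `seconds_between` is made up for illustration, and it assumes the pontifex2 timestamp is in UTC, since that log line carries no timezone.

```python
from datetime import datetime, timezone


def seconds_between(libvirt_ts: str, guest_ts: str) -> float:
    """Return the seconds elapsed from the libvirt timestamp to the guest timestamp."""
    # libvirt qemu log format, e.g. "2022-08-18 09:38:40.092+0000"
    start = datetime.strptime(libvirt_ts, "%Y-%m-%d %H:%M:%S.%f%z")
    # Guest log format, e.g. "2022-08-18T09:40:59".
    # Assumption: the guest logs in UTC; the line itself has no zone.
    fault = datetime.strptime(guest_ts, "%Y-%m-%dT%H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return (fault - start).total_seconds()


delay = seconds_between("2022-08-18 09:38:40.092+0000", "2022-08-18T09:40:59")
print(f"first fault {delay:.3f} s after qemu start")
```

If the UTC assumption holds, the first fault came a little over two minutes after the qemu process was started.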