Comment # 9 on bug 1172541 from
@miroslav: During the summer I tried to investigate, but the problem is
that the thing won't fail reliably, which makes bisection very hard.  And
then USA politics went into overdrive, which took a lot of time and
focus away from "important" work like the kernel bug.

I ran a test last night (2020-10-21 AM).  I first tried
kernel-default-5.8.14-1.2, but unbound wouldn't start; see bug 1177715.
I reverted to 5.7.9-1-default (on the VM host), started the guest, gave
it some minimal work (other host logs in with X-Windows propagation,
guest pings various hosts and shows a graph of the results with
X-Windows), and let it run.  Every 3 hours both hosts do some monitoring
and auditing tasks which fork off up to 10 tests in parallel, 50-60
tests overall.  I've noticed that the host is most likely to crash
during these tests, but my attempts to point the finger of blame at a
specific test were fruitless.  The tests at midnight went OK (no
complaints, no crash), but 03:04 was the last message in syslog on both
machines, and the host crashed (no watchdog).  So kernel 5.7.9 doesn't
have a fix.

It's interesting that neither the host's nor the guest's syslog has a
block of \0's after the last message, while back in June they did.  That
means that both then and now the log file's inode was written with a size
covering everything that had been sent to the kernel, but formerly the
data itself was not yet synced to disc, while now it gets synced more
promptly.
rsyslogd has an option to sync after every write, which I didn't turn
on, and perhaps I should.  [Done.]
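
A quick way to check a log file for that block of \0's after a crash is
to count the NUL bytes at its tail.  What follows is only a minimal
Python sketch: the function name trailing_nul_count is made up for
illustration and is not part of rsyslog or any other tool mentioned
here.

```python
import os


def trailing_nul_count(path, chunk_size=4096):
    """Return the number of consecutive NUL bytes at the end of a file.

    Hypothetical helper for illustration only. It reads the file
    backwards in chunks so that a large log does not have to be
    loaded into memory.
    """
    pos = os.path.getsize(path)
    count = 0
    with open(path, "rb") as f:
        while pos > 0:
            step = min(chunk_size, pos)
            pos -= step
            f.seek(pos)
            chunk = f.read(step)
            stripped = chunk.rstrip(b"\0")
            count += len(chunk) - len(stripped)
            if stripped:
                # Found a non-NUL byte, so the NUL run ends here.
                break
    return count
```

A result of 0 would correspond to the current behaviour (data synced
before the crash); a large count would correspond to what the logs
looked like in June, where the file size had been recorded but the data
blocks had not reached the disc.  The log path to pass in (for example
/var/log/messages) is an assumption and depends on the rsyslog
configuration.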

@joerg: None of my machines has a serial port.  I could try buying a
pair of USB serial dongles, with a wire between.  If I get this working
I'll post the result.  USB is useless while booting, but if the machine
is fully functional until the crash, the USB dongle should send at least
something.

I upgraded the guest's packages, specifically libvirt, qemu-x86 and
numerous friends, to get recent important-seeming bugfixes that had not
been installed on the mothballed guest due to political distractions. 
I'll try the test again.  Wish me luck, I'm going to need it.

