On Mon, 2020-02-17 at 18:39 -0800, Glen wrote:
Dear OpenSuse Team:
Hello here as well, :-)
1. Several people had the same problem, where guests randomly stall/freeze.
2. The problem seems NOT to be related to OpenSuse itself, or OpenSuse version, or Linux Kernel version.
3. The problem DOES seem to be related to Xen version, and to a specific module, the "credit-scheduler-2".
Reverting to any Xen prior to Xen 4.12 fixes the problem (thank you Olaf!) but that's suboptimal in terms of wanting to run the latest software versions (or, more to the point, the production versions that come with the Leap releases.)
With that in mind, the best fix so far seems to be to add "sched=credit" to GRUB_CMDLINE_XEN in /etc/default/grub, as in:
GRUB_CMDLINE_XEN="dom0_mem=4G dom0_max_vcpus=4 dom0_vcpus_pin gnttab_max_frames=256 sched=credit"
Right.
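For anyone who wants to apply the workaround above without editing the file by hand, here is a minimal sketch. The helper name add_sched_credit is hypothetical (it is not from this thread), it assumes GNU sed, and it assumes GRUB_CMDLINE_XEN sits on a single double-quoted line as in the example. The follow-up commands are the usual openSUSE ones, shown as comments so that nothing runs by accident.

```shell
#!/bin/sh
# Hypothetical helper (an illustration, not something from the thread):
# append sched=credit to the GRUB_CMDLINE_XEN line of the given file,
# unless that line already carries some sched= option.
add_sched_credit() {
    file="$1"
    # Leave the file alone if a scheduler is already selected explicitly.
    if grep -Eq '^GRUB_CMDLINE_XEN=".*sched=' "$file"; then
        return 0
    fi
    # Insert the option just before the closing double quote (GNU sed -i).
    sed -i 's/^\(GRUB_CMDLINE_XEN="[^"]*\)"/\1 sched=credit"/' "$file"
}

# Typical use on the host, as root (commented out on purpose):
#   add_sched_credit /etc/default/grub
#   grub2-mkconfig -o /boot/grub2/grub.cfg   # regenerate the GRUB config
#   reboot
#   xl info | grep xen_scheduler             # check which scheduler is active
```

Note that the helper deliberately does nothing if a sched= option is already present, so an explicit sched=credit2 would have to be changed by hand.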
Members of the Xen community have suggested making sched=credit the default until problems with credit-scheduler-2 are fixed. I have no idea how that would apply to us, but felt I should mention that, as it seems important.
"Us" being? openSUSE? Well, I guess that if upstream changes the default, we'll do the same, unless there are very good reasons not to. But I don't think this is what we should focus on at this stage...
I'm now inquiring of their users list when and how to file a bug report for this, and I'll continue to try to work with them, but I wanted to get this back to this group and list in case anyone else needs this info, and/or in case anyone here has any comments or additional guidance.
You have done a good job at reporting a bug on the Xen developer mailing list. The issue managed (after a little while, but through no fault of yours) to catch the attention of the Xen scheduler developers and maintainers (Juergen Gross and myself :-)).

So, nothing much more to say than <<Thanks! Keep up the good work of reporting nasty issues!>> :-)

Speaking of that (I mean, of keeping up the good work), as already asked on the upstream ML, if/when you get the chance to reproduce the problematic situation while running on Credit2, we'll be very happy to see some more logs.

Also, it's more than ok to continue this conversation here, but since it is an upstream issue, please do report any updates and logs that you can capture directly upstream (i.e., on the xen-devel mailing list).

Thanks and Regards
--
Dario Faggioli, Ph.D
http://about.me/dario.faggioli
Virtualization Software Engineer
SUSE Labs, SUSE https://www.suse.com/
-------------------------------------------------------------------
<<This happens because _I_ choose it to happen!>> (Raistlin Majere)