[Bug 967862] New: a xen dom0 appears to halt several times during boot-up or shutdown; hitting enter gets it going again.
http://bugzilla.opensuse.org/show_bug.cgi?id=967862

            Bug ID: 967862
           Summary: a xen dom0 appears to halt several times during boot-up or shutdown; hitting enter gets it going again.
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 42.1
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Xen
          Assignee: xen-bugs@suse.de
          Reporter: per@computer.org
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

This is a branch of bug#966449. I'm copying in the pertinent text.

------------------
I have also been testing leap with xen - two platforms:

a) HP Proliant DL360G5, quad-core, 8Gb, dual-GigE, single SAS drive.
b) Fujitsu Desktop, dual-core, 4Gb, GigE, single SATA drive.

On (a) it went fine, no problems whatsoever.
On (b) I did eventually manage to get an installation to work, but it's essentially unusable. During both startup and shutdown, it hangs several times causing both to take forever. Only manual intervention (pressing enter) at the console gets the process to continue.

I have no idea what is happening, but have seen messages such as:

rcu_sched kthread starved for N jiffies!
rcu_sched self-detected stall on CPU ...

-------------------
(In reply to Charles Arnold from comment #9)
Please download and install the following RPMs for your os42.1 host:
xen-4.5.2_06-5.1.x86_64.rpm
xen-libs-4.5.2_06-5.1.x86_64.rpm
xen-tools-4.5.2_06-5.1.x86_64.rpm

from this location,

http://download.opensuse.org/repositories/Virtualization:/openSUSE42.1/openSUSE_Leap_42.1/x86_64/
Once installed set the following hypervisor boot flag.
Edit /etc/default/grub and add,

GRUB_CMDLINE_XEN_DEFAULT="iommu=no-igfx"

and then run,

grub2-mkconfig -o /boot/grub2/grub.cfg
Reboot the host and select the xen menu entry. Please report your results.
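
The edit to /etc/default/grub described above can also be scripted. What follows is only a minimal sketch, not something taken from this bug: the function name `add_xen_option` is hypothetical, and it assumes the variable is written on a single line in the form GRUB_CMDLINE_XEN_DEFAULT="...". It works on the file contents as a string, so nothing is written to disk, and grub2-mkconfig still has to be run afterwards as described.

```python
import re

# Matches an uncommented, single-line assignment of the Xen default options.
_XEN_LINE = re.compile(r'^GRUB_CMDLINE_XEN_DEFAULT="(.*)"[ \t]*$', re.MULTILINE)


def add_xen_option(grub_text: str, option: str) -> str:
    """Return grub_text with `option` present in GRUB_CMDLINE_XEN_DEFAULT.

    Hypothetical helper: assumes the variable, if set, sits on one line
    and is enclosed in double quotes.
    """
    match = _XEN_LINE.search(grub_text)
    if match is None:
        # Variable not set yet: append a new assignment at the end.
        sep = "" if grub_text.endswith("\n") or not grub_text else "\n"
        return f'{grub_text}{sep}GRUB_CMDLINE_XEN_DEFAULT="{option}"\n'

    options = match.group(1).split()
    if option in options:
        return grub_text  # already present, leave the text untouched

    options.append(option)
    joined = " ".join(options)
    new_line = f'GRUB_CMDLINE_XEN_DEFAULT="{joined}"'
    return grub_text[:match.start()] + new_line + grub_text[match.end():]


if __name__ == "__main__":
    example = 'GRUB_TIMEOUT=8\nGRUB_CMDLINE_XEN_DEFAULT=""\n'
    print(add_xen_option(example, "iommu=no-igfx"), end="")
```

Calling the function a second time with the same option returns the text unchanged, so it is safe to re-run.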
On my test system (Fujitsu desktop), this was a significant improvement. During boot-up, I only needed to press enter once or twice, but during shutdown I still needed to hit enter several times to complete.

-------------------------
(In reply to Charles Arnold from comment #12)
(In reply to Per Jessen from comment #11)
(In reply to Per Jessen from comment #10)
On my test system (Fujitsu desktop), this was a significant improvement. During boot-up, I only needed to press enter once or twice, and once during shutdown.
Uh, correction - it did require hitting enter several times to complete a shutdown/reboot.
Could you explain what you are seeing that requires hitting the Enter key several times?
"requires" is perhaps the wrong word. I'm at the physical box with a console, and during boot-up or shutdown, it appears to just halt every so often. Hitting enter makes it continue. Then it halts, I hit enter etc. Without hitting enter, boot-up or shutdown never finishes. -- You are receiving this mail because: You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c5
--- Comment #5 from Per Jessen
I have now observed the same behaviour (i.e. halts that can be resumed by pressing a key) when running e.g. yast or mkinitrd.
Even just pressing Ctrl-G - the bell keeps going until I release Ctrl.
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c8
--- Comment #8 from Per Jessen
If this is with kernel-xen, please try suppressing use of C-states in the hypervisor. If this is with kernel-default (i.e. pv-ops), please try switching to kernel-xen.
This is in Dom0.
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c9
--- Comment #9 from Mike Latimer
(In reply to Jan Beulich from comment #7)
If this is with kernel-xen, please try suppressing use of C-states in the hypervisor. If this is with kernel-default (i.e. pv-ops), please try switching to kernel-xen.
This is in Dom0.
With the move to a pvops enabled -default kernel (in Tumbleweed), the fact that this is dom0 doesn't really answer the question. Can you provide the output of `uname -a` to confirm you are running kernel-xen?
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c11
--- Comment #11 from Per Jessen
(In reply to Per Jessen from comment #8)
(In reply to Jan Beulich from comment #7)
If this is with kernel-xen, please try suppressing use of C-states in the hypervisor. If this is with kernel-default (i.e. pv-ops), please try switching to kernel-xen.
This is in Dom0.
With the move to a pvops enabled -default kernel (in Tumbleweed), the fact that this is dom0 doesn't really answer the question. Can you provide the output of `uname -a` to confirm you are running kernel-xen?
FYI, this is Leap 42.1, not TW.

uname -a
Linux guest57 4.1.15-8-xen #1 SMP Wed Jan 20 16:41:00 UTC 2016 (0e3b3ab) x86_64 x86_64 x86_64 GNU/Linux

I have now booted once with max_cstate=3, I still had at least one halt. I'll do a few more boot-ups and shutdowns.
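
The question of which kernel flavor is running can be answered from the release string alone (the "4.1.15-8-xen" part of the uname output above). A minimal sketch follows, assuming the openSUSE naming convention in which the flavor is the last dash-separated component of the release; the helper name `kernel_flavor` is hypothetical.

```python
import platform


def kernel_flavor(release: str) -> str:
    """Return the trailing flavor of a kernel release string, or "" if none.

    Assumes releases of the form <version>-<build>-<flavor>,
    e.g. "4.1.15-8-xen" or "4.1.15-8-default".
    """
    _, _, suffix = release.rpartition("-")
    # A purely alphabetic suffix is treated as the flavor; a numeric one
    # (build number or plain version) means no flavor was appended.
    return suffix if suffix.isalpha() else ""


if __name__ == "__main__":
    # platform.release() returns the same string as `uname -r`.
    print(kernel_flavor(platform.release()))
```

For the release string quoted in this comment, the function returns "xen", i.e. kernel-xen rather than the pvops -default kernel.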
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c15
--- Comment #15 from Per Jessen
Please try "max_cstate=1" then, and if that still doesn't help "no-cpuidle".
"max_cstate=1" did the trick, the system is now much snappier, no pressing keys needed. Is this some BIOS option that I need to change, Jan? I'm using the defaults for this system, but I see settings such as

Enhanced SpeedStep y/n
Enhanced Idle Power State y/n

Both default to yes.
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c17
--- Comment #17 from Per Jessen
(In reply to Per Jessen from comment #15)
"max_cstate=1" did the trick, the system is now much snappier, no pressing keys needed. Is this some BIOS option that I need to change, Jan? I'm using the defaults for this system, but I see settings such as
Enhanced SpeedStep y/n
Enhanced Idle Power State y/n
Maybe worth trying (more likely for the latter than the former), but you never know what exactly hides behind those.
In the end right now we only know there is an issue (perhaps interrupt signals getting lost) _somewhere_. Could be due to hardware, firmware, or software. So far no hypervisor log has been attached here, so I can't even judge which of the C-state drivers is being used. If it's the MWAIT one, suppressing its use and using the ACPI one would be another thing to try. If the latter works, precise CPU model information would be needed, along with an indication of whether the intel_idle driver in Linux doesn't cause such hiccups.
I'll try disabling "enhanced speedstep" and see what happens.

Exact CPU-info:

guest57:~ # lscpu
Architecture:          x86_64
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                2
On-line CPU(s) list:   0,1
Thread(s) per core:    1
Core(s) per socket:    1
Socket(s):             2
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 23
Model name:            Intel(R) Core(TM)2 Duo CPU E8400 @ 3.00GHz
Stepping:              10
CPU MHz:               1999.000
BogoMIPS:              5991.77
Hypervisor vendor:     Xen
Virtualization type:   none

As for the rest, it is a vanilla Leap 42.1 installation, so whatever defaults apply wrt C-state drivers.

I'll be happy to provide whatever info you need, but OTOH this is just a test system, I don't really care that much if it's able to run xen.
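
Since no hypervisor log was attached, one way to at least confirm which options the hypervisor was booted with is the xen_commandline field printed by `xl info` in dom0. The sketch below is only an illustration: the function name `xen_commandline_options` is hypothetical, and the sample text is made up for the example rather than captured from the system in this bug.

```python
def xen_commandline_options(xl_info_text: str) -> dict:
    """Parse the xen_commandline field of `xl info` output into a dict.

    Options without a value (e.g. "no-cpuidle") map to an empty string.
    Returns an empty dict if the field is not present.
    """
    for line in xl_info_text.splitlines():
        # Split on the first colon only, so values containing ":" survive.
        key, sep, value = line.partition(":")
        if sep and key.strip() == "xen_commandline":
            options = {}
            for token in value.split():
                name, _, arg = token.partition("=")
                options[name] = arg
            return options
    return {}


if __name__ == "__main__":
    # Illustrative sample, not taken from the reporter's machine.
    sample = (
        "host                   : guest57\n"
        "xen_commandline        : iommu=no-igfx max_cstate=1\n"
    )
    options = xen_commandline_options(sample)
    print(options.get("max_cstate"))  # prints 1
```

On a real host the input would be the text printed by `xl info`, which normally has to be run as root.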
http://bugzilla.opensuse.org/show_bug.cgi?id=967862#c18
--- Comment #18 from Per Jessen
I'll try disabling "enhanced speedstep" and see what happens.
Made no difference.
I'll be happy to provide whatever info you need, but OTOH this is just a test system, I don't really care that much if it's able to run xen.
Jan, unless this issue is of general interest, I suggest we close this. I was only helping jdd get xen to run, I don't care if this specific hardware has a problem.