Bug ID 1042933
Summary kernel panic due to NMI caused by systemd-watchdog test
Classification openSUSE
Product openSUSE Tumbleweed
Version Current
Hardware x86-64
OS Other
Status NEW
Severity Normal
Priority P5 - None
Component Kernel
Assignee kernel-maintainers@forge.provo.novell.com
Reporter thomas.blume@suse.com
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

Created attachment 727827 [details]
reproducer code

The testsuite of systemd version 233 contains a test of the machine's watchdog.
When running it on a machine with a hardware watchdog, the kernel crashes:

-->
teviot login: root
Password: 
Last login: Tue Jun  6 16:22:57 from 2620:113:80c0:8000:c::50a
Have a lot of fun...
teviot:~ # 
teviot:~ # cd /systemd-testsuite/run
teviot:/systemd-testsuite/run # ./test-watchdog 
Hardware watchdog 'HPE iLO2+ HW [  185.386548] hpwdt: Unexpected close, not
stopping watchdog!
Watchdog Timer', version 0teviot:/systemd-testsuite/run # 
teviot:/systemd-testsuite/run # 
teviot:/systemd-testsuite/run # 
teviot:/systemd-testsuite/run # [  208.152002] Kernel panic - not syncing: An
NMI occurred. Depending on your system the reason for the NMI is logged in any
one of the following resources:
[  208.152002] 1. Integrated Management Log (IML)
[  208.152002] 2. OA Syslog
[  208.152002] 3. OA Forward Progress Log
[  208.152002] 4. iLO Event Log
[  208.152002] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.11.3-1-default #1
[  208.152002] Hardware name: HP ProLiant BL465c G1  , BIOS A13 05/02/2011
[  208.152002] Call Trace:
[  208.152002]  <NMI>
[  208.152002]  dump_stack+0x5c/0x78
[  208.152002]  panic+0xd5/0x21e
[  208.152002]  nmi_panic+0x35/0x40
[  208.152002]  hpwdt_pretimeout+0x7f/0xe7 [hpwdt]
[  208.152002]  nmi_handle+0x60/0x120
[  208.152002]  unknown_nmi_error+0x16/0x80
[  208.152002]  do_nmi+0xe5/0x130
[  208.152002]  end_repeat_nmi+0x1a/0x1e
[  208.152002]  ? native_safe_halt+0x2/0x10
[  208.152002]  ? native_safe_halt+0x2/0x10
[  208.152002]  ? native_safe_halt+0x2/0x10
[  208.152002]  </NMI>
[  208.152002]  ? default_idle+0x1a/0x100
[  208.152002]  ? do_idle+0x161/0x1f0
[  208.152002]  ? cpu_startup_entry+0x5d/0x60
[  208.152002]  ? start_kernel+0x436/0x43e
[  208.152002]  ? early_idt_handler_array+0x120/0x120
[  208.152002]  ? x86_64_start_kernel+0x127/0x136
[  208.152002]  ? start_cpu+0x14/0x14
[  208.152002] Kernel Offset: 0x3a000000 from 0xffffffff81000000 (relocation
range: 0xffffffff80000000-0xffffffffbfffffff)
--<

Unfortunately, I couldn't find any more information about the NMI in the IML or
the iLO log.
I could, however, reproduce the issue with a code snippet broken out of systemd.
The question is whether this is a kernel bug or a bug in the systemd code.

Attaching the reproducer.
Could the kernel maintainers please take a look and give a statement?
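
For readers without access to the attachment, here is a minimal sketch of the
generic open / set-timeout / ping / close sequence that a userspace watchdog
test goes through with the Linux watchdog API (<linux/watchdog.h>). It is not
the attached reproducer and not the systemd code: the function name
watchdog_session() and its parameters are hypothetical and chosen only for
illustration. The "Unexpected close" message in the log above suggests that the
driver did not see the magic close character before the device was closed, but
that is an inference from the log, not something confirmed here.

```c
#include <fcntl.h>
#include <linux/watchdog.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper for illustration only (not from systemd or the
 * attachment). Opens the watchdog device at `path`, sets the timeout,
 * pings it `pings` times and then tries to disarm it before closing.
 * Returns 0 on success, -1 if the device could not be opened.
 */
int watchdog_session(const char *path, int timeout_s, int pings)
{
    /* Opening the device normally starts the watchdog timer. */
    int fd = open(path, O_WRONLY);
    if (fd < 0) {
        perror("open");
        return -1;
    }

    /* Query and print the driver identity. */
    struct watchdog_info ident;
    if (ioctl(fd, WDIOC_GETSUPPORT, &ident) == 0)
        printf("Hardware watchdog '%s', version %x\n",
               (const char *)ident.identity, ident.firmware_version);

    /* Request a timeout; the driver may adjust the value it writes back. */
    if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_s) < 0)
        perror("WDIOC_SETTIMEOUT");

    /* Ping at half the timeout interval so the timer never expires. */
    for (int i = 0; i < pings; i++) {
        if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0)
            perror("WDIOC_KEEPALIVE");
        sleep(timeout_s > 1 ? timeout_s / 2 : 1);
    }

    /* Ask the driver to disable the timer (not every driver supports it). */
    int flags = WDIOS_DISABLECARD;
    if (ioctl(fd, WDIOC_SETOPTIONS, &flags) < 0)
        perror("WDIOC_SETOPTIONS");

    /* Magic close: write 'V' so the driver stops the timer on close,
     * unless it was loaded with nowayout. */
    if (write(fd, "V", 1) != 1)
        perror("write magic close");

    close(fd);
    return 0;
}
```

Calling watchdog_session("/dev/watchdog", 10, 5) as root would exercise a real
device. If the final write of 'V' is skipped, or the driver does not honour it,
the timer keeps running after close() and eventually fires.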

