Mailinglist Archive: opensuse-bugs (4655 mails)

[Bug 1042933] kernel panic due to nmi caused by systemd-watchdog test
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Wed, 28 Jun 2017 07:38:42 +0000
  • Message-id: <bug-1042933-21960-Y0QjuUlnd3@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1042933
http://bugzilla.suse.com/show_bug.cgi?id=1042933#c18

--- Comment #18 from Thomas Blume <thomas.blume@xxxxxxxx> ---
(In reply to Borislav Petkov from comment #17)

> Can you disable the watchdog before you run the test:
>
> # echo 0 > /proc/sys/kernel/nmi_watchdog
>
> as root.
>
> See if the NMI gets raised still.

Yes, the NMI is still visible.

> If it does, do this:
>
> static void hpwdt_ping(void)
> {
>         iowrite16(reload, hpwdt_timer_reg);
>
>         pr_err("%s: reload: %d, time left: %d\n", __func__, reload,
>                hpwdt_time_left());
> }
>
> so that we can see what *actually* gets written into the timer each time.

Sorry, that took a while since I got some compile errors.
That's fixed now and I get this:

-->
teviot:~ # modprobe -r hpwdt
teviot:~ #
teviot:~ #
teviot:~ # insmod /usr/src/linux-4.11.4-1/drivers/watchdog/hpwdt.ko
[ 69.481330] hpwdt: no symbol version for module_layout
[ 69.482380] hpwdt: loading out-of-tree module taints kernel.
[ 69.484632] hpwdt: tblume: New timer passed in is 30 seconds
[ 69.485922] hpwdt 0000:00:04.0: HPE Watchdog Timer Driver: NMI decoding initialized, allow kernel dump: ON (default = 1/ON)
[ 69.488473] hpwdt 0000:00:04.0: HPE Watchdog Timer Driver: 1.4.0, timer margin: 30 seconds (nowayout=0).
teviot:~ #
teviot:~ #
teviot:~ # /systemd-testsuite/run/test-watchdog
[ 78.964690] hpwdt: hpwdt_ping: reload: 234, time left: 29
Hardware watchdo[ 78.965863] hpwdt: tblume: New timer passed in is 10 seconds
g 'HPE iLO2+ HW [ 78.967012] hpwdt: hpwdt_ping: reload: 78, time left: 9
Watchdog Timer',[ 78.968197] hpwdt: hpwdt_ping: reload: 78, time left: 9
version 0
Set [ 78.969320] hpwdt: hpwdt_ping: reload: 78, time left: 9
hardware watchdo[ 78.970401] hpwdt: hpwdt_ping: reload: 78, time left: 9
g to 10s.
Pinging...
[ 80.213826] hpwdt: hpwdt_pretimeout: NMI raised
Pinging...
[ 83.971642] hpwdt: hpwdt_ping: reload: 78, time left: 9
Pinging...
[ 88.972910] hpwdt: hpwdt_ping: reload: 78, time left: 9
Pinging...
[ 93.974104] hpwdt: hpwdt_ping: reload: 78, time left: 9
Pinging...
[ 98.975338] hpwdt: hpwdt_ping: reload: 78, time left: 9
[ 98.980675] systemd-journald[367]: Sent WATCHDOG=1 notification.
[ 103.976561] hpwdt: hpwdt_ping: reload: 78, time left: 9
teviot:~ # [ 195.508137] systemd-journald[367]: Sent WATCHDOG=1 notification.
--<


> Also, the third thing to try is to reproduce on another HP box.
> Maybe this one's hpwdt BIOS crap is busted (wouldn't be a stretch).


Ok, trying to find another one for a second reproducer.

--
You are receiving this mail because:
You are on the CC list for the bug.