Mailinglist Archive: opensuse-bugs (4655 mails)

[Bug 1042933] kernel panic due to nmi caused by systemd-watchdog test
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Wed, 21 Jun 2017 11:29:55 +0000
  • Message-id: <bug-1042933-21960-I1BHUiPFqo@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1042933
http://bugzilla.suse.com/show_bug.cgi?id=1042933#c13

Borislav Petkov <bpetkov@xxxxxxxx> changed:

What |Removed |Added
----------------------------------------------------------------------------
Flags|needinfo?(bpetkov@xxxxxxxx) |

--- Comment #13 from Borislav Petkov <bpetkov@xxxxxxxx> ---
(In reply to Thomas Blume from comment #12)
> Boris, I'm wondering when exactly the watchdog timer starts counting down.
> Is it when opening /dev/watchdog, when doing the ping
> (ioctl(watchdog_fd, WDIOC_KEEPALIVE, 0)), or something else?

When you do WDIOC_SETTIMEOUT. When the kernel gets the timeout correctly
from userspace, it does the ping which goes and reprograms the timer.

It also reprograms the timer when you do watchdog_ping()

[ That thing does WDIOC_KEEPALIVE which does the reprogramming too. ]

and since nothing changes the timeout, it should simply "extend" the
timeout to the 10s interval and thus not fire while the test usleep()s
for t/2 seconds.
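
For reference, the userspace sequence described above can be sketched roughly as follows. This is only a minimal sketch and not the actual systemd test: the function names watchdog_exercise(), watchdog_disarm() and keepalive_interval_us() are made up for illustration, and it assumes the driver honours the "magic close" character 'V' and was not loaded with nowayout set.

```c
#define _DEFAULT_SOURCE
#include <fcntl.h>
#include <linux/watchdog.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Sleep between pings: half the timeout, like the t/2 sleep in the test. */
static unsigned int keepalive_interval_us(int timeout_s)
{
	return (unsigned int)timeout_s * 1000000u / 2u;
}

/* Try to disarm via magic close (assumed to be supported), then close. */
static void watchdog_disarm(int fd)
{
	if (write(fd, "V", 1) != 1)
		perror("magic close");
	close(fd);
}

/*
 * Hypothetical helper: open the device, set the timeout, ping it
 * 'pings' times with a t/2 sleep in between, then disarm it.
 * Returns 0 on success, -1 on any failure.
 */
static int watchdog_exercise(const char *dev, int timeout_s, int pings)
{
	int fd = open(dev, O_WRONLY);

	if (fd < 0) {
		perror("open");
		return -1;
	}

	/* The driver may adjust the value; timeout_s is updated in place. */
	if (ioctl(fd, WDIOC_SETTIMEOUT, &timeout_s) < 0) {
		perror("WDIOC_SETTIMEOUT");
		watchdog_disarm(fd);
		return -1;
	}

	for (int i = 0; i < pings; i++) {
		/* Each keepalive should reprogram the timer to the full timeout. */
		if (ioctl(fd, WDIOC_KEEPALIVE, 0) < 0) {
			perror("WDIOC_KEEPALIVE");
			watchdog_disarm(fd);
			return -1;
		}
		usleep(keepalive_interval_us(timeout_s));
	}

	watchdog_disarm(fd);
	return 0;
}
```

Calling watchdog_exercise("/dev/watchdog", 10, 5) as root would ping five times at 5s intervals with a 10s timeout. If the NMI still fires with a loop like this, the timer is apparently not being reprogrammed the way it should be.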

Do you see anything in dmesg from the watchdog while the test runs, some
failure messages or so?

If not, you could simply go and add pr_err() calls to
drivers/watchdog/hpwdt.c, more specifically to hpwdt_ping() to dump
the reload variable there, to hpwdt_change_timer() and to a couple more
interesting places.
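
To make sense of the dumped reload value, a small userspace helper can convert between seconds and timer ticks. The sketch below assumes the driver counts in 128 ms ticks (which is how I remember its seconds-to-ticks macro; please check the macros in your tree before relying on it). The names secs_to_ticks() and ticks_to_secs() are illustrative and are not the driver's own.

```c
/* ASSUMPTION: the hpwdt timer counts in 128 ms ticks; verify in your tree. */
#define ASSUMED_TICK_MS 128u

/* Reload value expected for a timeout given in seconds (rounds down). */
static unsigned int secs_to_ticks(unsigned int secs)
{
	return secs * 1000u / ASSUMED_TICK_MS;
}

/* Whole seconds represented by a reload value (rounds down). */
static unsigned int ticks_to_secs(unsigned int ticks)
{
	return ticks * ASSUMED_TICK_MS / 1000u;
}
```

Under that assumption a 10s timeout should show up as reload = 78, so a very different value in the pr_err() output would point at the timeout not arriving correctly from userspace.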

And since it is a module, you don't need to reboot the machine - simply
rmmod/insmod it.

And in order to avoid the panicking, change hpwdt_pretimeout() to do:

	if (allow_kdump)
		hpwdt_stop();

	return NMI_HANDLED;

so that you don't panic the box.

This way you'll have a better idea what is happening.

HTH.
