Comment # 14 on bug 1042933 from
(In reply to Borislav Petkov from comment #13)

> Do you see anything in dmesg from the watchdog while the test runs, some
> failure messages or so?

Unfortunately dmesg doesn't show any hint about the error.

> If not, you could simply go and add pr_err() calls to
> drivers/watchdog/hpwdt.c, more specifically hpwdt_ping() and dump
> the reload variable there, hpwdt_change_timer() and a couple more
> interesting.

I did the following tweak:

-->
@@ -452,18 +452,19 @@
 static void hpwdt_ping(void)
 {
     iowrite16(reload, hpwdt_timer_reg);
+    pr_err("tblume: reload variable is: %d", reload);
 }

 static int hpwdt_change_timer(int new_margin)
 {
     if (new_margin < 1 || new_margin > HPWDT_MAX_TIMER) {
-        pr_warn("New value passed in is invalid: %d seconds\n",
+        pr_err("tblume: New value passed in is invalid: %d seconds\n",
             new_margin);
         return -EINVAL;
     }

     soft_margin = new_margin;
-    pr_debug("New timer passed in is %d seconds\n", new_margin);
+    pr_err("tblume: New timer passed in is %d seconds\n", new_margin);
     reload = SECS_TO_TICKS(soft_margin);

     return 0;
@@ -495,6 +496,9 @@
     if (allow_kdump)
         hpwdt_stop();

+    //tblume: suppress NMI
+    return NMI_HANDLED;
+
     if (!is_icru && !is_uefi) {
         if (cmn_regs.u1.ral == 0) {
             nmi_panic(regs, "An NMI occurred, but unable to determine source.\n");
--<

and got this result:

-->
# strace -r -f -o /tmp/strace-test-watchdog ./test-watchdog
Hardware watchdog 'HPE iLO2+ HW Watchdog Timer', version 0
[58494.591725] hpwdt: tblume: reload variable is: 234
[58494.592304] hpwdt: tblume: New timer passed in is 10 seconds
Set hardware watchdog to 10s.
[58494.594422] hpwdt: tblume: reload variable is: 78
[58494.594741] hpwdt: tblume: reload variable is: 78
Pinging...
[58494.595855] hpwdt: tblume: reload variable is: 78
Pinging...
[58494.597223] hpwdt: tblume: reload variable is: 78
Pinging...
[58499.598933] hpwdt: tblume: reload variable is: 78
Pinging...
[58504.600657] hpwdt: tblume: reload variable is: 78
[58509.602239] hpwdt: tblume: reload variable is: 78
[58509.604433] systemd-journald[376]: Sent WATCHDOG=1 notification.
Pinging...
[58514.603684] hpwdt: tblume: reload variable is: 78
teviot:/systemd-testsuite/run # [58519.604118] hpwdt: tblume: reload variable is: 78
[58624.200498] systemd-journald[376]: Sent WATCHDOG=1 notification.
--<
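
For reference, the reload values in the log look consistent with the seconds-to-ticks conversion. The sketch below assumes the 128 ms tick macros as I read them in drivers/watchdog/hpwdt.c (the definitions may differ between kernel versions); the two helper functions are hypothetical and only wrap the macros for illustration:

```c
/*
 * Sketch of the seconds <-> ticks arithmetic.
 * Assumption: one watchdog tick is 128 ms, as in the conversion
 * macros of drivers/watchdog/hpwdt.c.
 */
#define SECS_TO_TICKS(secs)   ((secs) * 1000 / 128)
#define TICKS_TO_SECS(ticks)  ((ticks) * 128 / 1000)

/* Hypothetical helper, not part of the driver: seconds -> reload ticks. */
static inline unsigned int secs_to_ticks(unsigned int secs)
{
    return SECS_TO_TICKS(secs);
}

/* Hypothetical helper, not part of the driver: reload ticks -> seconds. */
static inline unsigned int ticks_to_secs(unsigned int ticks)
{
    return TICKS_TO_SECS(ticks);
}
```

Under that assumption, a 10 second margin gives 10 * 1000 / 128 = 78 ticks, which matches the value printed after "New timer passed in is 10 seconds", and the initial 234 is what a 30 second margin would give. So the value written by hpwdt_ping() at least appears to be the expected one.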

Can you make any sense out of this, or do I need to add more debugging?

