Mailinglist Archive: opensuse-bugs (6095 mails)

[Bug 1075876] Kernel Oops, null pointer dereference in timecounter_read
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Tue, 17 Apr 2018 05:41:57 +0000
  • Message-id: <bug-1075876-21960-sH138YtxBK@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1075876
http://bugzilla.suse.com/show_bug.cgi?id=1075876#c8

--- Comment #8 from Benjamin Poirier <bpoirier@xxxxxxxx> ---
I also started to look at this case. The problem is that tc->cc is null.

We have the following disassembly.
Upon entering timecounter_read(),
rdi contains struct timecounter *tc.

50 cycle_now = tc->cc->read(tc->cc);
0xffffffff81105096 <+6>: mov (%rdi),%rax
offsetof(struct timecounter, cc) = 0
rax now contains tc->cc, which is 0 (NULL)

61
62 return ns_offset;
63 }
64
65 u64 timecounter_read(struct timecounter *tc)
66 {
0xffffffff81105099 <+9>: mov %rdi,%rbx
We get the saved value of rdi (tc) in rbx, ffff88025476b7a0

50 cycle_now = tc->cc->read(tc->cc);
0xffffffff8110509c <+12>: mov %rax,%rdi
0xffffffff8110509f <+15>: callq *(%rax)
offsetof(struct cyclecounter, read) = 0
This is the null deref (deref of cc), tc->cc->read
We already know that cc is null.
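
To illustrate why a zeroed struct timecounter faults on that indirect call, here is a minimal user-space sketch. The struct and function names (timecounter_model, cyclecounter_model, timecounter_model_is_initialized, fixed_read) are hypothetical stand-ins, not the kernel definitions; the sketch only assumes, as the disassembly shows, that cc is the first member of the timecounter and read is the first member of the cyclecounter.

```c
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the kernel structs (hypothetical names).
 * Only the position of the first member matters for this illustration. */
struct cyclecounter_model {
    uint64_t (*read)(const struct cyclecounter_model *cc); /* offset 0 */
    uint64_t mask;
};

struct timecounter_model {
    const struct cyclecounter_model *cc; /* offset 0 */
    uint64_t cycle_last;
    uint64_t nsec;
};

/* Returns 1 when tc->cc->read(tc->cc) can be called safely, 0 otherwise.
 * A zero-filled timecounter has cc == NULL, so loading the function
 * pointer from *(cc + 0) is a load from address 0. */
static int timecounter_model_is_initialized(const struct timecounter_model *tc)
{
    return tc != NULL && tc->cc != NULL && tc->cc->read != NULL;
}

/* Dummy counter used only to show the initialized case. */
static uint64_t fixed_read(const struct cyclecounter_model *cc)
{
    (void)cc;
    return 42;
}
```

With both offsets equal to 0, "mov (%rdi),%rax" loads cc and "callq *(%rax)" loads read from the same address that cc holds, which is address 0 when tc was never initialized.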

The crash occurs after 4 hours because of:

\ e1000e_ptp_init
INIT_DELAYED_WORK
schedule_delayed_work(..., 4hours)

tc should be initialized by:

\ e1000_probe
\ e1000e_reset
\ e1000e_systim_reset
timecounter_init(&adapter->tc, &adapter->cc,
ktime_to_ns(ktime_get_real()));

This kernel includes a backport of
aa524b66c5ef e1000e: don't modify SYSTIM registers during SIOCSHWTSTAMP ioctl
(v4.7-rc1)
This was backported for SLE12SP3.

From this commit we can see that e1000e_systim_reset() may return before
calling timecounter_init() if ret_val is nonzero. Indeed, in the log from
comment 4 we see
Feb 04 15:49:22 fphnbam3 kernel: e1000e 0000:00:1f.6: Failed to restore TIMINCA
clock rate delta: -22
which shows that e1000e_systim_reset() exited early (-22 is -EINVAL),
leaving tc uninitialized.
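
The sequence can be sketched in user space as follows. This is only a model under stated assumptions: adapter_model, systim_reset_model, overflow_check_model and read_counter are hypothetical names, and the NULL guard in the delayed-work stand-in is an illustration of the failure mode, not the fix applied to the driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the driver state. */
struct cc_model { uint64_t (*read)(const struct cc_model *cc); };
struct tc_model { const struct cc_model *cc; uint64_t nsec; };
struct adapter_model { struct cc_model cc; struct tc_model tc; };

/* Dummy hardware counter read. */
static uint64_t read_counter(const struct cc_model *cc)
{
    (void)cc;
    return 1000;
}

/* Models the early exit: a nonzero ret_val skips the timecounter setup,
 * so tc.cc keeps its zero-initialized (NULL) value. */
static int systim_reset_model(struct adapter_model *a, int ret_val)
{
    if (ret_val)
        return ret_val;
    a->cc.read = read_counter;
    a->tc.cc = &a->cc;
    a->tc.nsec = 0;
    return 0;
}

/* Models the delayed work that later reads the counter. The guard is an
 * assumption added for illustration; without it, the read below would
 * dereference NULL exactly as in the oops. Returns 0 on success, -1 if
 * the timecounter was never initialized. */
static int overflow_check_model(struct adapter_model *a, uint64_t *cycles)
{
    if (a->tc.cc == NULL || a->tc.cc->read == NULL)
        return -1;
    *cycles = a->tc.cc->read(a->tc.cc);
    return 0;
}
```

Calling systim_reset_model() with -22 on a zeroed adapter leaves tc.cc NULL, and the later overflow_check_model() call is the point where the unguarded code would crash.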

I'm not sure where the -EINVAL comes from, why it only occurs sometimes
(according to the launchpad report linked to in comment 4), or why it seems
to affect only 4.4 stable kernels (I didn't find reports on other
versions).
