Mailinglist Archive: opensuse-bugs (6095 mails)

[Bug 1075876] Kernel Oops, null pointer dereference in timecounter_read
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Tue, 24 Apr 2018 08:24:37 +0000
  • Message-id: <bug-1075876-21960-S8W5QI11fC@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1075876
http://bugzilla.suse.com/show_bug.cgi?id=1075876#c15

Benjamin Poirier <bpoirier@xxxxxxxx> changed:

What    |Removed                     |Added
----------------------------------------------------------------------------
Assignee|kernel-maintainers@xxxxxxxx |bpoirier@xxxxxxxx
        |ovo.novell.com              |

--- Comment #15 from Benjamin Poirier <bpoirier@xxxxxxxx> ---
Created attachment 768067
--> http://bugzilla.suse.com/attachment.cgi?id=768067&action=edit
e1000e: Log and workaround e1000e_get_base_timinca failures on spt macs

Thanks again for the quick feedback.

(In reply to Achim Mildenberger from comment #14)
> Created attachment 768018 [details]
> Test with kernel 4.4.127-4.g662314f-default

That's very interesting. I thought the problem would occur only at boot but it
seems it can happen any time TSYNCRXCTL is read.


> Dear Benjamin, thanks a lot! Seems to work.
>
> Since I'd like to understand a bit more, may I ask some questions?
>
> Why does it sometimes need a second attempt to get a response from
> "er32(TSYNCRXCTL) & E1000_TSYNCRXCTL_SYSCFI" in netdev.c?
> Your test would try up to ten times but up to now I only observed failed
> first attempts.

The Intel 82574 datasheet (§10.2.9.1) contains information about the
TSYNCRXCTL register but bit 5 (E1000_TSYNCRXCTL_SYSCFI) is simply marked
"Reserved". According to comments removed in commit 83129b37ef35 ("e1000e: fix
systim issues"), SYSCFI stands for "System Clock Frequency Indication". From
the comment "Stable 24MHz frequency" I assumed that it reads 1 once the clock
has had enough time to initialize and stabilize, but I'm not so sure anymore
after seeing the log in comment 14. I don't know why it works sometimes but not
others; 10 tries of 100 ms each just seemed like reasonable values to try.
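
For illustration only, the retry idea described above can be sketched in
user-space C. This is not the attached patch and not driver code: struct
fake_nic, fake_read_tsyncrxctl() and wait_for_syscfi() are hypothetical names
made up for this sketch, and the fake device simply reads back 0 a fixed number
of times before reporting SYSCFI (bit 5). The try count and delay are
parameters, so the 10 tries of 100 ms mentioned above are just one possible
choice.

```c
#define _POSIX_C_SOURCE 199309L /* for nanosleep() */
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Bit 5 of TSYNCRXCTL, the bit discussed in this comment. */
#define TSYNCRXCTL_SYSCFI (UINT32_C(1) << 5)

/* Hypothetical stand-in for the hardware: the first `failing_reads`
 * reads of the register come back without SYSCFI set. */
struct fake_nic {
    unsigned int failing_reads;
    unsigned int reads_done;
};

/* Simulated register read (stands in for a real register access). */
static uint32_t fake_read_tsyncrxctl(struct fake_nic *nic)
{
    nic->reads_done++;
    return nic->reads_done > nic->failing_reads ? TSYNCRXCTL_SYSCFI : 0;
}

static void sleep_ms(long ms)
{
    struct timespec ts = {
        .tv_sec = ms / 1000,
        .tv_nsec = (ms % 1000) * 1000000L,
    };
    nanosleep(&ts, NULL);
}

/* Poll until SYSCFI reads as set, at most max_tries times, sleeping
 * delay_ms between attempts. Returns the 1-based attempt on which the
 * bit was seen, or 0 if it was never seen. */
static int wait_for_syscfi(struct fake_nic *nic, int max_tries, long delay_ms)
{
    assert(max_tries > 0 && delay_ms >= 0);

    for (int attempt = 1; attempt <= max_tries; attempt++) {
        if (fake_read_tsyncrxctl(nic) & TSYNCRXCTL_SYSCFI)
            return attempt;
        /* No point in sleeping after the final failed attempt. */
        if (attempt < max_tries && delay_ms > 0)
            sleep_ms(delay_ms);
    }
    return 0;
}
```

With failing_reads set to 1, wait_for_syscfi(&nic, 10, 100) returns on the
second attempt, which mirrors the "failed first attempt" pattern reported in
comment 14. A return value of 0 corresponds to the case where a message such
as "No SYSCFI in TSYNCRXCTL" would be logged.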

I don't know if the behavior we observed is intentional or not. I will raise
the question on the intel-wired-lan mailing list but first, ...


> Does this problem only affect kernel 4.4 and is not happening in newer
> kernels? The code in file netdev.c, function e1000e_get_base_timinca looks
> unchanged to me when accessing the adapter variant "spt".

... I wonder about the same thing. I also looked at the code paths and came to
the same conclusion.

It seems like the following could be reports of the same problem on more
recent kernels:
https://bugzilla.redhat.com/show_bug.cgi?id=1463882
https://bugzilla.redhat.com/show_bug.cgi?id=1431863

If you don't mind testing another kernel, I've prepared a kernel package in
project home:benjamin_poirier:bsc1075876, based on the "master" openSUSE
branch (v4.16) with a similar debugging patch. Could you please try it and
report if you see the "No SYSCFI in TSYNCRXCTL" message with this kernel?
https://download.opensuse.org/repositories/home:/benjamin_poirier:/bsc1075876/standard

--
You are receiving this mail because:
You are on the CC list for the bug.