Mailinglist Archive: opensuse-bugs (4284 mails)

[Bug 1039737] Kernel BUG at ../mm/huge_memory.c / split_huge_page
  • From: bugzilla_noreply@xxxxxxxxxx
  • Date: Tue, 23 May 2017 15:16:04 +0000
  • Message-id: <bug-1039737-21960-YO0n5ybMTx@http.bugzilla.suse.com/>
http://bugzilla.suse.com/show_bug.cgi?id=1039737
http://bugzilla.suse.com/show_bug.cgi?id=1039737#c10

--- Comment #10 from Jochen Hansper <hansper@xxxxxxxxxxx> ---
(In reply to Vlastimil Babka from comment #9)
> I've looked in more detail at the report and it's triggering here in
> __split_huge_page_map():
>
> BUG_ON(!pte_none(*pte));
>
> where pte points to a deposited page table that the huge page keeps for when
> it needs to be split. Nobody should be accessing it while deposited, but
> here it was clearly written to. This definitely doesn't look like one of the
> THP-vs-something races that are being fixed upstream semi-regularly.
>
> Unfortunately we can't see from the oops what the unexpected value in
> the page table was; RDX points there but we don't see the contents. One
> possibility is to set up kdump and produce a crash dump to inspect. Or we add
> some debug printing. We could also make the deposited page read-only, which
> would trigger on any writes, unless it's a HW problem.
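
For illustration only, the debug-printing idea could look roughly like the user-space sketch below. It is not kernel code and not a proposed patch: pte_val_t, entry_is_none() and report_unexpected_entries() are hypothetical names, entry_is_none() is a simplified stand-in for pte_none() that treats only an all-zero entry as empty, and 512 entries per table is assumed (x86-64 with 4 KiB pages).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 512 entries per page table (x86-64, 4 KiB pages). */
#define PTRS_PER_PTE 512

/* Hypothetical stand-in for the kernel's pte_t. */
typedef uint64_t pte_val_t;

/* Simplified analogue of pte_none(): only an all-zero entry is empty. */
static int entry_is_none(pte_val_t entry)
{
    return entry == 0;
}

/*
 * Scan a table that is expected to be empty and print every entry that
 * is not, so the unexpected value is visible instead of only a BUG.
 * Returns the number of non-empty entries found.
 */
static size_t report_unexpected_entries(const pte_val_t *table, FILE *out)
{
    size_t bad = 0;

    assert(table != NULL);
    for (size_t i = 0; i < PTRS_PER_PTE; i++) {
        if (!entry_is_none(table[i])) {
            fprintf(out, "entry %zu: unexpected value 0x%016llx\n",
                    i, (unsigned long long)table[i]);
            bad++;
        }
    }
    return bad;
}
```

Called on a table that should be empty, a non-zero return value corresponds to the condition the BUG_ON catches, with the offending values printed rather than lost.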

I've rebuilt the debug kernel with the following .config settings:

CONFIG_DEBUG_VM=y
CONFIG_DEBUG_VM_VMACACHE=y
CONFIG_DEBUG_VM_RB=y
CONFIG_DEBUG_VIRTUAL=y

Will this help with debugging?

I haven't managed to get X running yet (debug kernel + nvidia or nouveau). I'll
look into that if this approach is likely to be useful. Unfortunately, I need
the machine on which the bug occurred for my day-to-day work, so I can't go
without X...

Before opening this bug report, I ran 24h+ memtest and 24h+ prime95 (on Ubuntu
16.10, kernel 4.8) without issues.
