[Bug 757888] New: numa_register_memblks.constprop.7+0x54/0x13c()
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c0

Summary: numa_register_memblks.constprop.7+0x54/0x13c()
Classification: openSUSE
Product: openSUSE 12.1
Version: Final
Platform: Other
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: per@opensuse.org
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:10.0.2) Gecko/20100101 Firefox/10.0.2

Hardware: ML570G3, 32 GB RAM, 2xSmartarray, dual GigE NICs.

On a newly installed system with 12.1+updates, I spotted the following in the dmesg output:

[    0.000000] ------------[ cut here ]------------
[    0.000000] WARNING: at /home/abuild/rpmbuild/BUILD/kernel-default-3.1.9/linux-3.1/arch/x86/mm/numa.c:505 numa_register_memblks.constprop.7+0x54/0x13c()
[    0.000000] Hardware name: ProLiant ML570 G3
[    0.000000] Modules linked in:
[    0.000000] Pid: 0, comm: swapper Not tainted 3.1.9-1.4-default #1
[    0.000000] Call Trace:
[    0.000000]  [<ffffffff810042fa>] dump_trace+0x9a/0x270
[    0.000000]  [<ffffffff81525724>] dump_stack+0x69/0x6f
[    0.000000]  [<ffffffff81051f5b>] warn_slowpath_common+0x7b/0xc0
[    0.000000]  [<ffffffff81b5b7eb>] numa_register_memblks.constprop.7+0x54/0x13c
[    0.000000]  [<ffffffff81b5bc47>] numa_init.part.6+0x31/0x10a
[    0.000000]  [<ffffffff81b5beaa>] x86_numa_init+0x16/0x39
[    0.000000]  [<ffffffff81b4b4c3>] setup_arch+0x642/0x724
[    0.000000]  [<ffffffff81b488a0>] start_kernel+0x91/0x39d
[    0.000000]  [<ffffffff81b48434>] x86_64_start_kernel+0xd1/0xe0
[    0.000000] ---[ end trace 4eaa2a86a8e2da22 ]---
[    0.000000] No NUMA configuration found

It doesn't seem to affect operation of the system though.

Reproducible: Always

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c1
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c2
Per Jessen
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c3
Jeff Mahoney
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c4
Rafael Wysocki
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c5
Per Jessen
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c
Rafael Wysocki
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c6
--- Comment #6 from Rafael Wysocki
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c
Rafael Wysocki
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c
Rafael Wysocki
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c7
Thomas Renninger
> No NUMA configuration found

is correct.

> It doesn't seem to affect operation of the system though.

Yes, everything is fine.
The machine still has an SRAT table defined with one entry in it.
All memory:
  Base Address   : 0000000000000000
  Address Length : 0000000800000000
is assigned to NUMA domain:
  Proximity Domain : 00000001

In ACPI specification (SRAT table definitions):
5.2.15.2 Memory Affinity Structure
The Proximity Domain field is defined as:
  Integer that represents the proximity domain to which the processor belongs
But there are no LAPIC/processor mappings to NUMA/proximity domains defined.
-> So this can be interpreted as a BIOS bug already.

I think the Linux kernel should still handle this more gracefully.

I wonder why they provide the SRAT table at all, it has zero info. Even though it is named:
  Oem Table ID : "HOT ADD "
it has the hotadd flag for the memory set to zero.

Especially because the "enabled" bit of the only provided memory structure is not set:
  Enabled : 0
the whole table and all its entries should be ignored.

Ok, I guess I found a fix.
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c8
Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c9
--- Comment #9 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c10
--- Comment #10 from Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c12
Per Jessen
> Some questions:
> I expect this machine (ProLiant ML570 G3) is not SLE11 certified?

SLE11 is not listed, but the HP support website does list SLE10 and SLE9.

> I guess HP does not support Linux on this platform at all, otherwise this
> message would have been noticed (should have shown up for all recent
> kernels, I expect)?

The HP support website lists both SLES and RHEL.

> Why did you choose this one, HP offers quite some platforms with
> SUSE/Linux support?

It was an unused server in the store-room :-)
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c13
Thomas Renninger
> HP support website does list SLE10 and SLE9.

Wow, is the machine that old... Hm, iirc the BIOS is from 2008.

Could you give the kernel a quick test and check whether the warning message is gone, please (see comment #10): I double checked and kernels built from 29-Jul-2012 13:03 or newer do have the fix.
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c14
--- Comment #14 from Per Jessen
>> HP support website does list SLE10 and SLE9.
> Wow, is the machine that old... Hm, iirc the BIOS is from 2008.

Right, the latest is P37.

> Could you give the kernel a quick test and check whether the warning
> message is gone, please (see comment #10): I double checked and kernels
> built from 29-Jul-2012 13:03 or newer do have the fix.

Yep, I'm working on it, just updating the PXE config now.
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c15
Per Jessen
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c16
--- Comment #16 from Thomas Renninger
> ... the NUMA warning is gone.

Thanks! I am going to send the patches upstream and will close this bug as soon as they are accepted.
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c17
--- Comment #17 from Bernhard Wiedemann
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c18
Thomas Renninger
https://bugzilla.novell.com/show_bug.cgi?id=757888
https://bugzilla.novell.com/show_bug.cgi?id=757888#c19
--- Comment #19 from Len Brown