[Bug 254316] New: 32bit ia32 SUSE products do not boot on systems with more than 62GB of memory
https://bugzilla.novell.com/show_bug.cgi?id=254316
Summary: 32bit ia32 SUSE products do not boot on systems with
more than 62GB of memory
Product: openSUSE 10.2
Version: Final
Platform: i686
OS/Version: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
AssignedTo: kernel-maintainers@forge.provo.novell.com
ReportedBy: vandrove@vc.cvut.cz
QAContact: qa@suse.de
Hello,
it is not possible to boot ia32 SUSE products (SLES10/SLED10/openSUSE 10.x) on a
system with more than approx. 62GB of memory. The kernel runs out of memory in
the normal zone and crashes:
<5>Linux version 2.6.16.37-0.18-bigsmp (geeko@buildhost) (gcc version 4.1.2
20070115 (prerelease) (SUSE Linux)) #1 SMP Tue Feb 6 22:46:13 UTC 2007
<6>BIOS-provided physical RAM map:
<4> BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
<4> BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
<4> BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved)
<4> BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
<4> BIOS-e820: 0000000000100000 - 00000000efef0000 (usable)
<4> BIOS-e820: 00000000efef0000 - 00000000efeff000 (ACPI data)
<4> BIOS-e820: 00000000efeff000 - 00000000eff00000 (ACPI NVS)
<4> BIOS-e820: 00000000eff00000 - 00000000f0000000 (usable)
<4> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
<4> BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
<4> BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
<4> BIOS-e820: 0000000100000000 - 0000000fffc00000 (usable)
<5>64636MB HIGHMEM available.
<5>895MB LOWMEM available.
<6>found SMP MP-table at 000f6c90
<4>NX (Execute Disable) protection: active
<7>On node 0 totalpages: 16776192
<1>bootmem alloc of 536838176 bytes failed!
<4>Badness in __alloc_bootmem at mm/bootmem.c:406
<4> [<c0392cf1>] __alloc_bootmem+0x35/0x41
<4> [<c0392e91>] free_area_init_node+0x8c/0x40a
<4> [<c0390753>] page_table_range_init+0x50/0x80
<4> [<c0393221>] free_area_init+0x12/0x15
<4> [<c03888c0>] zone_sizes_init+0x43/0x48
<4> [<c0389467>] setup_arch+0xba2/0xcd7
<4> [<c03823bf>] start_kernel+0x3c/0x2d2
<0>Kernel panic - not syncing: Out of memory
<4> Badness in smp_call_function at arch/i386/kernel/smp.c:595
<4> [<c0112a32>] smp_call_function+0x52/0xbe
<4> [<c0122a64>] printk+0x14/0x18
<4> [<c0112ab1>] smp_send_stop+0x13/0x1c
<4> [<c0122160>] panic+0x4c/0xe4
<4> [<c0392cfb>] __alloc_bootmem+0x3f/0x41
<4> [<c0392e91>] free_area_init_node+0x8c/0x40a
<4> [<c0390753>] page_table_range_init+0x50/0x80
<4> [<c0393221>] free_area_init+0x12/0x15
<4> [<c03888c0>] zone_sizes_init+0x43/0x48
<4> [<c0389467>] setup_arch+0xba2/0xcd7
<4> [<c03823bf>] start_kernel+0x3c/0x2d2
The problem is the same as the one we filed against Ubuntu
(https://launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/87278):
With the default initrd size, the problem starts occurring somewhere between
62.5 and 63GB of memory in the box. By default, the 32bit kernel reports that
the initrd image should be loaded just below the 512MB boundary. This splits the
896MB lowmem block into two pieces: one from ~8MB to ~500MB (our tests show that
there is around 502MB of memory here) and one from ~512MB to 896MB
(384MB large). Unfortunately, when there is more than 63GB of memory, the
pages array grows beyond 502MB, the memory allocation fails, and the system
panics and then triple-faults because the panic occurred too early.
Our (VMware) tests confirm that the same workaround as for Ubuntu works for your
products as well - changing grub's config to do:
uppermem 40000
kernel --no-mem-option
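
The numbers in the log and in the explanation above can be cross-checked with a little arithmetic. The following is only a sketch: the 32-byte entry size and the one extra entry are assumptions inferred from the log (536838176 bytes requested for 16776192 pages), and the helper names are hypothetical, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 32 bytes per page-array entry, inferred from the log
 * (536838176 / (16776192 + 1) == 32); the report does not state it. */
#define ASSUMED_ENTRY_SIZE 32u
#define PAGE_SIZE_BYTES    4096u

/* Bytes needed for the page array covering total_pages pages.
 * The "+ 1" extra entry is an assumption that makes the result
 * match the failed bootmem allocation in the log. */
uint64_t page_array_bytes(uint64_t total_pages)
{
    return (total_pages + 1) * ASSUMED_ENTRY_SIZE;
}

/* Largest amount of RAM (in whole MB) whose page array still fits
 * into one contiguous free lowmem block of block_mb megabytes. */
uint64_t max_ram_mb_for_block(uint64_t block_mb)
{
    assert(block_mb > 0);
    uint64_t block_bytes = block_mb << 20;
    uint64_t pages = block_bytes / ASSUMED_ENTRY_SIZE - 1;
    return (pages * PAGE_SIZE_BYTES) >> 20;
}
```

Under these assumptions, page_array_bytes(16776192) gives the 536838176 bytes from the log (just under 512MB), and max_ram_mb_for_block(502) gives 64255MB, i.e. about 62.75GB, which agrees with the observed threshold between 62.5 and 63GB.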
https://bugzilla.novell.com/show_bug.cgi?id=254316
------- Comment #1 from andrea@novell.com 2007-03-14 13:18 MST -------
The kernel has no control over where the initrd will be loaded; the kernel
simply finds the initrd in some arbitrary place when it boots. However, the
kernel is the one that knows best where the normal zone ends, so relocating the
initrd may be good practice for the best possible defragmentation of the
physical space at boot. That would work around the grub bug too.
But the grub side is clearly the one to blame. For example, the mmap_length of
the multiboot_info is only an unsigned long, while the e820 map sizes can be
larger than 4G... so I wonder if the 500M number comes from an overflow inside
grub's memory probe? (The kernel e820 map uses long long for length, following
the example.)
--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=254316
------- Comment #2 from vandrove@vc.cvut.cz 2007-03-14 18:11 MST -------
Hello Andrea,
(un)fortunately the kernel does have control over where the initrd is loaded -
in arch/i386/boot/setup.S there is the setupseg structure, and since header
version 2.03 it contains the ramdisk_max field, which is set to
(-__PAGE_OFFSET-(512 << 20)-1) & 0x7FFFFFFF. For the default PAGE_OFFSET of
0xC0000000 this gives: (-3G - 512MB - 1) mod 2G = 1G - 512M - 1 = 512M - 1...
And grub tries to load the initrd as high as possible - so it is put directly
below 512MB, in the middle of the normal zone :-(
I've also just figured out that if you boot the kernel with a vmalloc area over
512MB, then it won't find the initrd anymore, as it expects the initrd only in
the normal zone.
So I was thinking that setting ramdisk_max to 128MB would be a good idea, but
then I realized that it would break >120MB initrds. So it seems that this bug
has to be reassigned to grub to change its heuristic from 'load initrd as high
as possible' to something smarter, like 'load initrd at 64MB as long as it fits
there, and if it won't fit, then load it as high as possible'.
BTW, you may be interested that when 'uppermem 40000' is passed to grub, your
32bit products actually do work on systems with 63.996GB (65532MB). (I found out
that 48GB+ configurations are not supported at all only after filing this bug,
sorry about that...)
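
The ramdisk_max arithmetic quoted in comment #2 can be checked directly. This is a minimal sketch that evaluates the same expression in 32-bit unsigned arithmetic; the function name is hypothetical and the code is not taken from setup.S.

```c
#include <stdint.h>

/* Evaluates (-__PAGE_OFFSET - (512 << 20) - 1) & 0x7FFFFFFF
 * for a given PAGE_OFFSET, using wrap-around 32-bit arithmetic. */
uint32_t ramdisk_max_for(uint32_t page_offset)
{
    uint32_t v = (uint32_t)0 - page_offset;   /* -3G wraps to 1G for 0xC0000000 */
    v -= (uint32_t)512 << 20;                 /* minus 512MB */
    v -= 1u;                                  /* minus 1 */
    return v & 0x7FFFFFFFu;                   /* "mod 2G" */
}
```

For PAGE_OFFSET 0xC0000000 this yields 0x1FFFFFFF, i.e. 512M - 1, matching the figure derived in the comment.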
https://bugzilla.novell.com/show_bug.cgi?id=254316
------- Comment #3 from andrea@novell.com 2007-03-14 18:48 MST -------
Good point, I didn't notice this ramdisk_max parameter in the 2.03 header. grub
is using it:
    moveto = (moveto - len) & 0xfffff000;
    max_addr = (lh->header == LINUX_MAGIC_SIGNATURE && lh->version >= 0x0203
                ? lh->initrd_addr_max : LINUX_INITRD_MAX_ADDRESS);
    if (moveto + len >= max_addr)
      moveto = (max_addr - len) & 0xfffff000;
So ok, the kernel is certainly partly to blame for this 512M thing. But I don't
think the right solution is to move the initrd lower. If you move it down,
either by lowering ramdisk_max or by making grub smarter, you risk fragmenting
memory during the bootmem allocator allocations, which may leave partial pages
unused. So it would waste a bit of memory; not a big deal, but if we're going to
change something, we had better make it optimal. Loading the initrd at the end
of the normal zone sounds right.
So I think we either have to raise ramdisk_max to 800M or relocate the initrd
manually before initializing the bootmem allocator (the latter sounds a bit
safer, as it won't clash so easily if you enlarge the vmalloc space).
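
To see how the grub snippet quoted in comment #3 interacts with ramdisk_max and with the 'uppermem' workaround, here is a small standalone sketch of the same placement logic. The function name and the mem_top parameter are hypothetical; in particular, treating 'uppermem 40000' as a memory top of roughly 1MB + 40000KB is an assumption about how grub derives the starting value of moveto, and the 8MB initrd size is only an illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Picks a page-aligned load address for an initrd of len bytes:
 * as high as possible below mem_top, but ending below max_addr.
 * Mirrors the two assignments to moveto quoted in comment #3. */
uint32_t initrd_load_addr(uint32_t mem_top, uint32_t len, uint32_t max_addr)
{
    assert(len <= mem_top && len <= max_addr);

    uint32_t moveto = (mem_top - len) & 0xfffff000u;
    if (moveto + len >= max_addr)
        moveto = (max_addr - len) & 0xfffff000u;
    return moveto;
}
```

With max_addr = 0x1FFFFFFF (512M - 1), an 8MB initrd, and a memory top of 896MB, the result is 0x1F7FF000, just below 512MB and in the middle of the normal zone. With an assumed memory top of 0x2810000 (about 40MB), the first computation already stays below max_addr and the initrd lands at 0x2010000 (about 32MB), which leaves the lowmem above it contiguous.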
https://bugzilla.novell.com/show_bug.cgi?id=254316
Nick Piggin
https://bugzilla.novell.com/show_bug.cgi?id=254316
Lars Marowsky-Bree
https://bugzilla.novell.com/show_bug.cgi?id=254316
User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c6
--- Comment #6 from Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=254316
Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=254316
User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c7
--- Comment #7 from Petr Vandrovec
https://bugzilla.novell.com/show_bug.cgi?id=254316
User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c8
--- Comment #8 from Petr Vandrovec
https://bugzilla.novell.com/show_bug.cgi?id=254316
User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c9
--- Comment #9 from Petr Vandrovec
https://bugzilla.novell.com/show_bug.cgi?id=254316
User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c10
--- Comment #10 from Petr Vandrovec
https://bugzilla.novell.com/show_bug.cgi?id=254316
User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c11
Petr Vandrovec
https://bugzilla.novell.com/show_bug.cgi?id=254316
User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c12
--- Comment #12 from Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=254316
User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c13
--- Comment #13 from Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=254316
Jiri Kosina
https://bugzilla.novell.com/show_bug.cgi?id=254316
User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c16
Jiri Slaby
https://bugzilla.novell.com/show_bug.cgi?id=254316
Jiri Slaby