[Bug 254316] New: 32bit ia32 SUSE products do not boot on systems with more than 62GB of memory
https://bugzilla.novell.com/show_bug.cgi?id=254316

           Summary: 32bit ia32 SUSE products do not boot on systems with more than 62GB of memory
           Product: openSUSE 10.2
           Version: Final
          Platform: i686
        OS/Version: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
        AssignedTo: kernel-maintainers@forge.provo.novell.com
        ReportedBy: vandrove@vc.cvut.cz
         QAContact: qa@suse.de

Hello,

it is not possible to boot ia32 SUSE products (SLES10/SLED10/openSUSE10.x) on a system with more than approx. 62GB of memory. The kernel runs out of memory in the normal zone and crashes:

<5>Linux version 2.6.16.37-0.18-bigsmp (geeko@buildhost) (gcc version 4.1.2 20070115 (prerelease) (SUSE Linux)) #1 SMP Tue Feb 6 22:46:13 UTC 2007
<6>BIOS-provided physical RAM map:
<4> BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
<4> BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
<4> BIOS-e820: 00000000000ca000 - 00000000000cc000 (reserved)
<4> BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
<4> BIOS-e820: 0000000000100000 - 00000000efef0000 (usable)
<4> BIOS-e820: 00000000efef0000 - 00000000efeff000 (ACPI data)
<4> BIOS-e820: 00000000efeff000 - 00000000eff00000 (ACPI NVS)
<4> BIOS-e820: 00000000eff00000 - 00000000f0000000 (usable)
<4> BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
<4> BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
<4> BIOS-e820: 00000000fffe0000 - 0000000100000000 (reserved)
<4> BIOS-e820: 0000000100000000 - 0000000fffc00000 (usable)
<5>64636MB HIGHMEM available.
<5>895MB LOWMEM available.
<6>found SMP MP-table at 000f6c90
<4>NX (Execute Disable) protection: active
<7>On node 0 totalpages: 16776192
<1>bootmem alloc of 536838176 bytes failed!
<4>Badness in __alloc_bootmem at mm/bootmem.c:406
<4> [<c0392cf1>] __alloc_bootmem+0x35/0x41
<4> [<c0392e91>] free_area_init_node+0x8c/0x40a
<4> [<c0390753>] page_table_range_init+0x50/0x80
<4> [<c0393221>] free_area_init+0x12/0x15
<4> [<c03888c0>] zone_sizes_init+0x43/0x48
<4> [<c0389467>] setup_arch+0xba2/0xcd7
<4> [<c03823bf>] start_kernel+0x3c/0x2d2
<0>Kernel panic - not syncing: Out of memory
<4> Badness in smp_call_function at arch/i386/kernel/smp.c:595
<4> [<c0112a32>] smp_call_function+0x52/0xbe
<4> [<c0122a64>] printk+0x14/0x18
<4> [<c0112ab1>] smp_send_stop+0x13/0x1c
<4> [<c0122160>] panic+0x4c/0xe4
<4> [<c0392cfb>] __alloc_bootmem+0x3f/0x41
<4> [<c0392e91>] free_area_init_node+0x8c/0x40a
<4> [<c0390753>] page_table_range_init+0x50/0x80
<4> [<c0393221>] free_area_init+0x12/0x15
<4> [<c03888c0>] zone_sizes_init+0x43/0x48
<4> [<c0389467>] setup_arch+0xba2/0xcd7
<4> [<c03823bf>] start_kernel+0x3c/0x2d2

The problem is the same as the one we've filed against Ubuntu (https://launchpad.net/ubuntu/+source/linux-source-2.6.17/+bug/87278): with the default initrd size, the problem starts occurring somewhere between 62.5 and 63GB of memory in the box.

The problem is that by default the 32bit kernel reports that the initrd image should be loaded just below the 512MB boundary. This splits the 896MB lowmem block into two pieces - one from ~8MB to ~500MB (our tests show that there is something around 502MB of memory here) and one from ~512 to 896MB (384MB large). Unfortunately, when there is more than 63GB of memory, the pages array grows beyond 502MB, the memory allocation fails, and the system panics and then triple-faults because the panic occurred too early.

Our (VMware) tests confirm that the same workaround as for Ubuntu works for your products as well - changing grub's config to do:

    uppermem 40000
    kernel --no-mem-option <kernel_image> <andargumentsasalreadypresent>

instead of

    kernel <kernel_image> <andarguments...>

as long as the initrd is smaller than 30MB.

Is there any reason why your kernel suggests a recommended initrd location at 512MB? Using either 800MB or 40MB would work, as then there would be a 700+MB contiguous block of physical memory for the pages array. But with 512MB, the normal zone (832MB) is split into two pieces - one slightly below 500MB, and another ~300MB, neither of them sufficiently large to hold a 512MB pages array :-(

Thanks,
Petr Vandrovec

--
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug, or are watching someone who is.
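
The reported threshold can be cross-checked against the numbers in the boot log. The following is only a rough sketch: it assumes 4KB pages and a 32-byte per-page structure (inferred from the failed allocation of 536838176 bytes for 16776192 pages, not stated explicitly in the report), and the function names are made up for illustration.

```python
PAGE_SIZE = 4096          # i386 page size in bytes
STRUCT_PAGE_SIZE = 32     # assumption: inferred from the log, 536838176 / 16776192 is ~32
MB = 1 << 20


def memmap_bytes(ram_bytes):
    """Approximate size of the pages array needed to describe ram_bytes of RAM."""
    return (ram_bytes // PAGE_SIZE) * STRUCT_PAGE_SIZE


def max_ram_for_block(block_bytes):
    """Largest RAM size whose pages array still fits in one contiguous block."""
    return (block_bytes // STRUCT_PAGE_SIZE) * PAGE_SIZE


# Top of usable RAM taken from the last e820 entry in the report.
ram_top = 0xfffc00000

# Pages array needed for this machine, in MB (just under 512MB).
print(memmap_bytes(ram_top) / MB)

# RAM size, in MB, at which a ~502MB contiguous block stops being enough.
print(max_ram_for_block(502 * MB) / MB)
```

Under these assumptions a 502MB block is exhausted at 64256MB (62.75GB) of RAM, which falls inside the 62.5 to 63GB range observed in the tests.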
https://bugzilla.novell.com/show_bug.cgi?id=254316

------- Comment #1 from andrea@novell.com 2007-03-14 13:18 MST -------
The kernel has no control over where the initrd is loaded; it finds the initrd at some arbitrary place when it boots. However, the kernel is the one that knows best where the normal zone ends, so relocating the initrd may be good practice for the best possible defragmentation of the physical space at boot. That would work around the grub bug too.

But the grub side is clearly the one to blame. For example, the mmap_length of the multiboot_info is only an unsigned long, while the e820 map sizes can be larger than 4G... so I wonder whether the 500M number comes from an overflow inside grub's memory probe? (The kernel e820 map uses long long for the length, following the example.)
https://bugzilla.novell.com/show_bug.cgi?id=254316

------- Comment #2 from vandrove@vc.cvut.cz 2007-03-14 18:11 MST -------
Hello Andrea,

(un)fortunately the kernel does have control over where the initrd is loaded - in arch/i386/boot/setup.S there is the setupseg structure, and since header version 2.03 it contains the ramdisk_max field, which is set to (-__PAGE_OFFSET-(512 << 20)-1) & 0x7FFFFFFF. For the default PAGE_OFFSET of 0xC0000000 this gives: (-3G - 512MB - 1) mod 2G = 1G - 512M - 1 = 512M - 1... And grub tries to load the initrd as high as possible - so it is put directly below 512MB, in the middle of the normal zone :-(

I've also just figured out that if you boot the kernel with a vmalloc area over 512MB, it won't find the initrd anymore, as it expects the initrd only in the normal zone.

So I was thinking that setting ramdisk_max to 128MB would be a good idea, but then I realized that it would break >120MB initrds. So it seems that this bug has to be reassigned to grub, to change its heuristic from 'load initrd as high as possible' to something smarter, like 'load initrd at 64MB as long as it fits there, and if it won't fit then load it as high as possible'.

BTW, you may be interested to know that when 'uppermem 40000' is passed to grub, your 32bit products actually do work on systems with 63.996GB (65532MB). (I only found out after filing this bug that 48GB+ configurations are not supported at all, sorry about that...)
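
The ramdisk_max arithmetic above can be checked with a few lines. This is a sketch only: it assumes the expression is evaluated as 32-bit two's-complement arithmetic, and the function and its `reserve` parameter are illustrative names, not kernel identifiers.

```python
def ramdisk_max(page_offset, reserve=512 << 20):
    """Evaluate (-__PAGE_OFFSET - reserve - 1) & 0x7FFFFFFF in 32-bit arithmetic."""
    # Assumption: the value wraps to 32 bits before the final mask is applied.
    wrapped = (-page_offset - reserve - 1) & 0xFFFFFFFF
    return wrapped & 0x7FFFFFFF


limit = ramdisk_max(0xC0000000)   # default PAGE_OFFSET from the comment above
print(hex(limit))                 # 0x1fffffff, i.e. 512M - 1
```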
https://bugzilla.novell.com/show_bug.cgi?id=254316

------- Comment #3 from andrea@novell.com 2007-03-14 18:48 MST -------
Good point, I didn't notice this ramdisk_max parameter in the 2.03 header. grub is using it:

    moveto = (moveto - len) & 0xfffff000;
    max_addr = (lh->header == LINUX_MAGIC_SIGNATURE && lh->version >= 0x0203
                ? lh->initrd_addr_max : LINUX_INITRD_MAX_ADDRESS);
    if (moveto + len >= max_addr)
      moveto = (max_addr - len) & 0xfffff000;

So OK, the kernel is certainly partly to blame for this 512M thing. But I don't think the right solution is to move the initrd lower. If you move it down, either by lowering ramdisk_max or by making grub smarter, you risk fragmenting memory during the bootmem allocator allocations, which may leave partial pages unused. So it would waste a bit of memory; not a big deal, but if we're going to change something, we had better do it optimally. Loading the initrd at the end of the normal zone sounds right.

So I think we either have to raise ramdisk_max to 800M or relocate the initrd manually before initializing the bootmem allocator (the latter sounds a bit safer, as it won't clash so easily if you enlarge the vmalloc space).
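
To see how the quoted grub logic ends up placing the initrd just below 512MB, the placement can be modelled roughly as follows. This is a simplified sketch, not grub's code: `mem_top` stands for whatever starting value grub has in `moveto` (assumed here to be the top of the memory it knows about), and the 8MB initrd size is an arbitrary example.

```python
PAGE_MASK = 0xfffff000   # same 4KB alignment mask as in the quoted grub code


def initrd_load_address(mem_top, length, max_addr):
    """Model of the quoted logic: place the initrd as high as possible,
    then clamp it below max_addr if it would reach or cross that limit."""
    moveto = (mem_top - length) & PAGE_MASK
    if moveto + length >= max_addr:
        moveto = (max_addr - length) & PAGE_MASK
    return moveto


RAMDISK_MAX = 0x1fffffff   # 512M - 1, the limit discussed in this thread
INITRD_LEN = 8 << 20       # hypothetical 8MB initrd

# With plenty of memory, the initrd is clamped to just below 512MB.
print(hex(initrd_load_address(0xf0000000, INITRD_LEN, RAMDISK_MAX)))

# If the boot loader believes memory ends at roughly 40MB, the initrd
# stays low and the clamp never triggers.
print(hex(initrd_load_address(40 << 20, INITRD_LEN, RAMDISK_MAX)))
```

The second call is meant to mirror the effect of the 'uppermem 40000' workaround; the exact starting address grub derives from that setting is not taken from the thread.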
https://bugzilla.novell.com/show_bug.cgi?id=254316

Nick Piggin <npiggin@novell.com> changed:

           What    |Removed                                  |Added
----------------------------------------------------------------------------
         AssignedTo|kernel-maintainers@forge.provo.novell.com|ak@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=254316

Lars Marowsky-Bree <lmb@novell.com> changed:

           What    |Removed                                  |Added
----------------------------------------------------------------------------
                 CC|                                         |lmb@novell.com
         AssignedTo|kernel-maintainers@forge.provo.novell.com|jslaby@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=254316

User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c6

--- Comment #6 from Jiri Slaby <jslaby@novell.com> 2008-05-24 12:40:37 MDT ---
Could you try the kernel from http://labs.suse.cz/jslaby/bug-254316/ ?
https://bugzilla.novell.com/show_bug.cgi?id=254316

Jiri Slaby <jslaby@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |NEEDINFO
      Info Provider|                            |vandrove@vc.cvut.cz
https://bugzilla.novell.com/show_bug.cgi?id=254316

User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c7

--- Comment #7 from Petr Vandrovec <vandrove@vc.cvut.cz> 2008-06-08 22:38:47 MDT ---
Created an attachment (id=220918)
 --> (https://bugzilla.novell.com/attachment.cgi?id=220918)
65532MB dmesg
https://bugzilla.novell.com/show_bug.cgi?id=254316

User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c8

--- Comment #8 from Petr Vandrovec <vandrove@vc.cvut.cz> 2008-06-08 22:39:14 MDT ---
Created an attachment (id=220919)
 --> (https://bugzilla.novell.com/attachment.cgi?id=220919)
65536MB dmesg
https://bugzilla.novell.com/show_bug.cgi?id=254316

User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c9

--- Comment #9 from Petr Vandrovec <vandrove@vc.cvut.cz> 2008-06-08 22:39:31 MDT ---
Created an attachment (id=220920)
 --> (https://bugzilla.novell.com/attachment.cgi?id=220920)
70000MB dmesg
https://bugzilla.novell.com/show_bug.cgi?id=254316

User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c10

--- Comment #10 from Petr Vandrovec <vandrove@vc.cvut.cz> 2008-06-08 22:39:48 MDT ---
Created an attachment (id=220921)
 --> (https://bugzilla.novell.com/attachment.cgi?id=220921)
140000MB dmesg
https://bugzilla.novell.com/show_bug.cgi?id=254316

User vandrove@vc.cvut.cz added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c11

Petr Vandrovec <vandrove@vc.cvut.cz> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEEDINFO                    |NEW
      Info Provider|vandrove@vc.cvut.cz         |

--- Comment #11 from Petr Vandrovec <vandrove@vc.cvut.cz> 2008-06-08 22:41:17 MDT ---
Thanks. Works up to some value between 70GB and 140GB, which is good enough for me - VMware never claimed to support more than 64GB with 32bit guests.
https://bugzilla.novell.com/show_bug.cgi?id=254316

User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c12

--- Comment #12 from Jiri Slaby <jslaby@novell.com> 2008-06-09 02:39:03 MDT ---
You ran out of lowmem space in the last case. The theoretical maximum is 112G. You would need to change the 1/3 split to 3/1, 2/2, or whatever split leaves enough lowmem for the memmap to fit.
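
The 112G figure is consistent with a back-of-the-envelope calculation in which the memmap alone fills all of lowmem. A minimal sketch, assuming 4KB pages, 896MB of lowmem, and a 32-byte per-page structure (the latter inferred from the boot log in the original report, not confirmed in this comment); the function name is illustrative:

```python
PAGE_SIZE = 4096          # i386 page size in bytes
STRUCT_PAGE_SIZE = 32     # assumption: inferred from the log in the original report
MB = 1 << 20
GB = 1 << 30


def max_ram_bytes(lowmem_bytes):
    """RAM size at which the memmap by itself would occupy all of lowmem."""
    return (lowmem_bytes // STRUCT_PAGE_SIZE) * PAGE_SIZE


print(max_ram_bytes(896 * MB) // GB)   # prints 112
```

In practice the limit is lower, since lowmem also has to hold the kernel image and other boot-time allocations.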
https://bugzilla.novell.com/show_bug.cgi?id=254316

User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c13

--- Comment #13 from Jiri Slaby <jslaby@novell.com> 2008-06-09 09:42:46 MDT ---
Created an attachment (id=221061)
 --> (https://bugzilla.novell.com/attachment.cgi?id=221061)
Patch from the testing kernel
https://bugzilla.novell.com/show_bug.cgi?id=254316

Jiri Kosina <jkosina@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
      Info Provider|vandrove@vc.cvut.cz         |mge@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=254316

User jslaby@novell.com added comment
https://bugzilla.novell.com/show_bug.cgi?id=254316#c16

Jiri Slaby <jslaby@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |WONTFIX

--- Comment #16 from Jiri Slaby <jslaby@novell.com> 2008-07-13 01:57:03 MDT ---
Closing as wontfix, since we don't support so much memory.
https://bugzilla.novell.com/show_bug.cgi?id=254316

Jiri Slaby <jslaby@novell.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|RESOLVED                    |CLOSED