[Bug 222898] New: Kexec is broken on x86_64
https://bugzilla.novell.com/show_bug.cgi?id=222898 Summary: Kexec is broken on x86_64 Product: openSUSE 10.2 Version: Beta 2 plus Platform: Other OS/Version: Other Status: NEW Severity: Normal Priority: P5 - None Component: Kernel AssignedTo: kernel-maintainers@forge.provo.novell.com ReportedBy: bwalle@novell.com QAContact: qa@suse.de I'm running STABLE, and the error message of 'rckdump start' is: Overlapping memory segments at 0x141c000 sort_segments failed The problem is described also in http://lkml.org/lkml/2006/10/5/86 (since we have the patches.xen/x86_64-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch, this is also applicable to our older kernel), but the patch in this posting doesn't work here. -- Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email ------- You are receiving this mail because: ------- You are on the CC list for the bug, or are watching someone who is.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #1 from bwalle@novell.com 2006-11-23 10:33 MST ------- Created an attachment (id=106769) --> (https://bugzilla.novell.com/attachment.cgi?id=106769&action=view) Fix This patch from Takashi Iwai fixes the problem. See also: http://fourier.suse.de/mlarch/SuSE/kernel/2006/kernel.2006.11/msg00393.html
https://bugzilla.novell.com/show_bug.cgi?id=222898 gregkh@novell.com changed:

What      |Removed                                  |Added
----------------------------------------------------------------------------
AssignedTo|kernel-maintainers@forge.provo.novell.com|tiwai@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=222898 jblunck@novell.com changed:

What   |Removed                  |Added
----------------------------------------------------------------------------
CC     |                         |jblunck@novell.com, jbeulich@novell.com
Summary|Kexec is broken on x86_64|kexec is broken on i386 & x86_64
Version|Beta 2 plus              |RC 5

------- Comment #2 from jblunck@novell.com 2006-12-06 07:15 MST ------- Jan, is it necessary to move the bss segment for XEN? Is it safe to revert this change, as this patch does?
https://bugzilla.novell.com/show_bug.cgi?id=222898 jblunck@novell.com changed:

What         |Removed|Added
----------------------------------------------------------------------------
Status       |NEW    |NEEDINFO
Info Provider|       |jbeulich@novell.com
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #3 from jblunck@novell.com 2006-12-06 07:29 MST ------- Err, you're moving the bss section for all archs EXCEPT XEN. I guess the better question is, why did you move it at all?
https://bugzilla.novell.com/show_bug.cgi?id=222898 jbeulich@novell.com changed:

What         |Removed            |Added
----------------------------------------------------------------------------
Status       |NEEDINFO           |ASSIGNED
Info Provider|jbeulich@novell.com|

------- Comment #4 from jbeulich@novell.com 2006-12-06 07:52 MST ------- Hmm, I don't think I agree: x86_64-put-note-sections-into-a-pt_note-segment-in-vmlinux.patch moves it unconditionally, but xen3-fixup-arch-x86_64 restores it to its old position *except* for Xen. While I can't say for sure the move is absolutely needed (I do know that when building the head kernel on older distros [which I do frequently] the original placement causes a problem in ld), given the statement above I can't see why the placement of .bss would matter here.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #5 from bwalle@novell.com 2006-12-19 06:29 MST ------- The placement of bss does matter, because if the BSS is placed at the beginning (as it is now without CONFIG_XEN), then the linker puts each section it places in data.init also in data. You can try it out; I don't know why.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #6 from bwalle@novell.com 2006-12-19 06:52 MST ------- I now also have a linker script that doesn't have this problem (found by trial and error, as I couldn't find any detailed information in the ld manual that would explain the behaviour). It's the patch from above without moving BSS unconditionally (i.e. BSS is at the beginning) and:

    #ifndef CONFIG_XEN
    __bss_start = .; /* BSS */
    .bss : AT(ADDR(.bss) - LOAD_OFFSET) {
        *(.bss.page_aligned)
        *(.bss)
    }:NONE
    __bss_stop = .;
    #endif
    . = ALIGN(PAGE_SIZE);
    . = ALIGN(CONFIG_X86_L1_CACHE_BYTES);
    .data.cacheline_aligned : AT(ADDR(.data.cacheline_aligned) - LOAD_OFFSET) {
        *(.data.cacheline_aligned)
    } :data

So the change is the ":NONE" and the ":data". I'll attach this script to give a complete overview. The result (with readelf) is:

    01 .data .bss .data.cacheline_aligned .data.read_mostly
    02 .vsyscall_0 .xtime_lock .vxtime .wall_jiffies .sys_tz sysctl_vsyscall .xtime .jiffies .vsyscall_1 .vsyscall_2 .vsyscall_3
    03 .data.init_task .data.page_aligned .init.text .init.data .init.setup initcall.init .con_initcall.init .altinstructions .altinstr_replacement exit.text .init.ramfs .data_nosave
    04

This looks good (CONFIG_DEBUG_INFO=y). Because our build process extracts the debug information to a separate binary and then strips the original kernel, I simply ran 'strip' over the binary. The result of our kernel RPM (--debug=yes) is the same.
After the strip, we have:

    01 .data .bss .data.cacheline_aligned .data.read_mostly .data.init_task data.page_aligned .init.text .init.data .init.setup .initcall.init con_initcall.init .altinstructions .altinstr_replacement .exit.text init.ramfs .data_nosave
    02 .vsyscall_0 .xtime_lock .vxtime .wall_jiffies .sys_tz sysctl_vsyscall .xtime .jiffies .vsyscall_1 .vsyscall_2 .vsyscall_3
    03 .data.init_task .data.page_aligned .init.text .init.data .init.setup initcall.init .con_initcall.init .altinstructions .altinstr_replacement exit.text .init.ramfs .data_nosave
    04

Again, some sections (like .data.init_task) appear twice, and the segments overlap in the physical address space:

    Type Offset             VirtAddr           PhysAddr           FileSiz            MemSiz             Flags Align
    LOAD 0x0000000000200000 0xffffffff81000000 0x0000000001000000 0x0000000000275668 0x0000000000275668 R E   200000
    LOAD 0x0000000000476000 0xffffffff81276000 0x0000000001276000 0x00000000000f4718 0x00000000002ebafc RWE   200000
    LOAD 0x0000000000600000 0xffffffffff600000 0x000000000136b000 0x0000000000000c08 0x0000000000000c08 RWE   200000
    LOAD 0x000000000076c000 0xffffffff8136c000 0x000000000136c000 0x000000000002f004 0x000000000002f004 RWE   200000
    NOTE 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 0x0000000000000000 R     8

The second LOAD segment ends at 0x0000000001276000 + 0x00000000002ebafc = 0x1561AFC, which is larger than 0x000000000136c000, the physical start address of the fourth LOAD segment.
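The overlap arithmetic in Comment #6 can be reproduced with a small script. The following is only a sketch, not the kexec-tools code: the function name `find_overlaps` is made up for illustration, and the only input is the PhysAddr/MemSiz pairs of the four LOAD segments quoted above. It treats each segment as the half-open range [PhysAddr, PhysAddr + MemSiz) and reports every pair that intersects.

```python
from itertools import combinations

# (PhysAddr, MemSiz) of the four LOAD segments quoted in Comment #6.
LOAD_SEGMENTS = [
    (0x0000000001000000, 0x0000000000275668),
    (0x0000000001276000, 0x00000000002EBAFC),
    (0x000000000136B000, 0x0000000000000C08),
    (0x000000000136C000, 0x000000000002F004),
]


def find_overlaps(segments):
    """Return all pairs of (start, size) segments whose physical ranges intersect.

    Hypothetical helper for illustration; each segment covers the
    half-open range [start, start + size).
    """
    overlaps = []
    for (a_start, a_size), (b_start, b_size) in combinations(sorted(segments), 2):
        if a_start < b_start + b_size and b_start < a_start + a_size:
            overlaps.append(((a_start, a_size), (b_start, b_size)))
    return overlaps


if __name__ == "__main__":
    for (a_start, a_size), (b_start, _) in find_overlaps(LOAD_SEGMENTS):
        print(f"segment at {a_start:#x} ends at {a_start + a_size:#x}, "
              f"past the start of the segment at {b_start:#x}")
```

With these values the segment at 0x1276000 is the one that runs into its successors, which matches the "Overlapping memory segments" error reported at the top of the bug.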
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #7 from bwalle@novell.com 2006-12-19 06:54 MST ------- Created an attachment (id=110296) --> (https://bugzilla.novell.com/attachment.cgi?id=110296&action=view) The current linker script This is the current linker script I described above
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #8 from bwalle@novell.com 2006-12-19 07:14 MST ------- My proposal to fix the bug for 10.2 would be to enable the fixes only for the kdump kernel (i.e. moving the BSS section and the data.init section). With this, we cannot break anything (since it already is broken :)), and it would probably work. For HEAD we will sooner or later switch to 2.6.19 (or not?) where the problem is already resolved by moving the BSS section and the other patches. Opinions? Thanks.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #9 from jbeulich@novell.com 2006-12-19 07:24 MST ------- I'd be afraid of patch conflicts if this became a conditional patch; I can't see anything wrong with the patch, so I'd say if Andi could have a look at this and agrees, it should become an unconditional patch.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #10 from bwalle@novell.com 2006-12-19 07:28 MST ------- @Jan: Do you know _why_ the BSS move is necessary to make the data.init move work? That would be interesting for me, because the intention of the BSS move is completely different from why we need it to fix the overlap bug.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #11 from jbeulich@novell.com 2006-12-19 07:31 MST ------- As said above, I don't think the move is strictly necessary, but also as said above it allows me, without extra patching, to build the resulting sources on older distros. That is sufficient reason for me to keep that change.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #12 from bwalle@novell.com 2006-12-19 07:35 MST ------- I think we're talking about different things. The move of BSS _is_ necessary to make the introduction of data.init work. I don't know why, but it simply is. I would like to know _why_ it is, because I don't understand what BSS has to do with whether the linker puts sections correctly in data.init (and not also in data).
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #13 from jbeulich@novell.com 2006-12-19 07:45 MST ------- Oh, sorry - this is, afaict, due to a binutils problem: On older versions it fails, but even current code doesn't handle @nobits pieces in a segment properly when followed by another segment. It just happens that the normal kernel doesn't get confused by that, but as you see kdump does.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #14 from bwalle@novell.com 2006-12-19 07:55 MST ------- There's no difference between the normal kernel and the kdump kernel. The only difference is that the kdump kernel is loaded by the kexec userspace utility, which checks for overlapping physical segments in the ELF header. I don't know what @nobits means (and I didn't find that term in the info page). My question is simply: Why is it necessary to put BSS at the end just so that the linker puts, for example, the section data.init_task only into the segment data.init and not also into data? I'm not talking about the linker script I attached but about the linker script I'll attach now to clarify.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #15 from bwalle@novell.com 2006-12-19 08:00 MST ------- Created an attachment (id=110315) --> (https://bugzilla.novell.com/attachment.cgi?id=110315&action=view) Linker script which doesn't work
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #16 from jbeulich@novell.com 2006-12-19 08:08 MST ------- I guess you'll have to ask a binutils (ld) person to find out.
https://bugzilla.novell.com/show_bug.cgi?id=222898 haveaniceday@cv-sv.de changed:

What|Removed|Added
----------------------------------------------------------------------------
CC  |       |haveaniceday@cv-sv.de

------- Comment #17 from haveaniceday@cv-sv.de 2006-12-29 13:20 MST ------- The patch "fix" works for i386. Only required for i386: part linux-2.6.18/arch/i386/kernel/vmlinux.lds.S-dist You can use "dumpelf vmlinux-2.6.18.5-*kdump |less " to see the elf relocation addresses. A correct kdump kernel should have a 4096-aligned address for Header #1 .p_vaddr/.p_paddr (0x?????000). All PT_LOAD segments should have a 0x1000-aligned address for kexec. Sample:

    /* Program Header #1 0x54 */
    {
        .p_type   = 1 , /* [PT_LOAD] */
        .p_offset = 2199552 ,
        .p_vaddr  = 0xC1218000 ,
        .p_paddr  = 0x1218000 ,
        .p_filesz = 508037 ,
        .p_memsz  = 860916 ,
        .p_flags  = 7 ,
        .p_align  = 4096
    },

Current vmlinux-2.6.18.2-34-kdump shows .p_vaddr=0xC1215480, .p_paddr=0x1215480 => 0x....480 => not 0x1000-aligned. I suggest applying only the fix for linux-2.6.18/arch/i386/kernel/vmlinux.lds.S first, e.g. to the kotd. kexec works for i386 in this case. Thanks, Christian
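The alignment rule from Comment #17 is easy to check by hand or with a few lines of code. This is a minimal sketch under the assumption that "0x1000 aligned" simply means the address is a multiple of 0x1000; the helper name `is_page_aligned` is hypothetical, and the two addresses are the .p_paddr values quoted in the comment.

```python
PAGE_SIZE = 0x1000  # 4096, the alignment Comment #17 asks for


def is_page_aligned(addr, page_size=PAGE_SIZE):
    """Return True if addr is a multiple of page_size (hypothetical helper)."""
    return addr % page_size == 0


# .p_paddr values quoted in Comment #17.
QUOTED_PADDRS = {
    "sample (working) kernel": 0x1218000,
    "vmlinux-2.6.18.2-34-kdump": 0x1215480,
}

if __name__ == "__main__":
    for name, paddr in QUOTED_PADDRS.items():
        state = "aligned" if is_page_aligned(paddr) else "NOT aligned"
        print(f"{name}: p_paddr={paddr:#x} is {state} to {PAGE_SIZE:#x}")
```

The same test applies to .p_vaddr, since the quoted virtual addresses differ from the physical ones only by the 0xC0000000 offset, which is itself a multiple of 0x1000.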
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #18 from bwalle@novell.com 2007-01-08 09:03 MST ------- Ok, the i386 case is now fixed. Thanks, Christian, for pointing that out; I forgot i386 completely. ;) Should appear at ftp://ftp.gwdg.de/pub/linux/suse/projects/kernel/kotd/10.2-i386/SL102_BRANCH (or another mirror) in a few days. Don't close the bug because x86_64 still needs to be fixed.
https://bugzilla.novell.com/show_bug.cgi?id=222898 bwalle@novell.com changed:

What         |Removed |Added
----------------------------------------------------------------------------
Status       |ASSIGNED|NEEDINFO
Info Provider|        |jbeulich@novell.com

------- Comment #19 from bwalle@novell.com 2007-01-11 02:27 MST ------- Jan, the kernel with the patch compiles flawlessly on SLES10. Which exact version of binutils is required to build a kernel with BSS at the end?
https://bugzilla.novell.com/show_bug.cgi?id=222898 jbeulich@novell.com changed:

What         |Removed            |Added
----------------------------------------------------------------------------
Status       |NEEDINFO           |ASSIGNED
Info Provider|jbeulich@novell.com|

------- Comment #20 from jbeulich@novell.com 2007-01-11 03:03 MST ------- Of course it does, if the patch you refer to moves .bss to the end unconditionally (all binutils versions can deal with that, it's the bss-in-the-middle that depends on exact binutils version). However, Andi rejected earlier (mainline) patches to do this because supposedly these were causing problems for someone, although it wasn't understood why.
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #21 from bwalle@novell.com 2007-01-11 11:12 MST ------- The patch has been in mainline for a few months now. And I think we can't leave the bug open only because there were some strange unknown problems in the past. Takashi, can you please decide how to proceed?
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #22 from tiwai@novell.com 2007-01-11 11:27 MST ------- Well, kexec isn't supported on openSUSE 10.2 with genuine stuff anyway, because the openSUSE 10.2 DVD has no kexec-tools package. So, it's really the user's thing. Of course, I'm in favor of fixing the bug, but if we are not sure, we don't have to spend much time -- it's no security fix after all.
https://bugzilla.novell.com/show_bug.cgi?id=222898 jbeulich@novell.com changed:

What         |Removed |Added
----------------------------------------------------------------------------
Status       |ASSIGNED|NEEDINFO
Info Provider|        |ak@novell.com

------- Comment #26 from jbeulich@novell.com 2007-01-15 01:19 MST ------- I'd really like to get your opinion here first, Andi (also regarding the respective SLE10 bug 234316).
https://bugzilla.novell.com/show_bug.cgi?id=222898 ------- Comment #27 from haveaniceday@cv-sv.de 2007-01-15 11:16 MST ------- @Takashi, the repo contains kexec-tools. ftp://ftp.opensuse.org/distribution/10.2/repo/oss/suse/i586/kexec-tools-1.101-57.i586.rpm
https://bugzilla.novell.com/show_bug.cgi?id=222898 bwalle@novell.com changed:

What         |Removed         |Added
----------------------------------------------------------------------------
AssignedTo   |tiwai@novell.com|jbeulich@novell.com
Status       |NEEDINFO        |NEW
Info Provider|ak@novell.com   |

------- Comment #29 from bwalle@novell.com 2007-01-22 16:44 MST ------- Ok, done. Jan, can you adjust the Xen patch?
https://bugzilla.novell.com/show_bug.cgi?id=222898 jbeulich@novell.com changed:

What      |Removed|Added
----------------------------------------------------------------------------
Status    |NEW    |RESOLVED
Resolution|       |FIXED

------- Comment #30 from jbeulich@novell.com 2007-01-23 01:34 MST ------- Done.
https://bugzilla.novell.com/show_bug.cgi?id=222898#c31 Bernhard Walle <bwalle@novell.com> changed:

What      |Removed |Added
----------------------------------------------------------------------------
CC        |        |meissner@novell.com
Status    |RESOLVED|REOPENED
Resolution|FIXED   |

--- Comment #31 from Bernhard Walle <bwalle@novell.com> 2007-08-20 05:20:07 MST --- Because of that, we need to release "kernel-kdump" in the next maintenance update for the 10.2 kernel.
https://bugzilla.novell.com/show_bug.cgi?id=222898 Bernhard Walle <bwalle@novell.com> changed:

What      |Removed            |Added
----------------------------------------------------------------------------
AssignedTo|jbeulich@novell.com|kgw@novell.com
Status    |REOPENED           |NEW