[Bug 1199597] New: gcc-12 miscompiles ovmf so qemu does not start
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Bug ID: 1199597 Summary: gcc-12 miscompiles ovmf so qemu does not start Classification: openSUSE Product: openSUSE Tumbleweed Version: Current Hardware: Other OS: Other Status: NEW Severity: Normal Priority: P5 - None Component: Virtualization:Other Assignee: virt-bugs@suse.de Reporter: jslaby@suse.com QA Contact: qa-bugs@suse.de CC: mliska@suse.cz, rguenther@suse.com Found By: --- Blocker: --- qemu is stuck when run (using efi boot) after the recent big update (gcc-12 rebuild). I found out that it matters how much memory is passed to qemu. So this minimal command makes qemu stuck: qemu-kvm -drive file=/dev/null,format=raw -drive if=pflash,format=raw,unit=0,readonly=on,file=/usr/share/qemu/ovmf-x86_64-opensuse-code.bin -m 3000 Using -m 2116 makes it progress. I recompiled ovmf and qemu using gcc11 in ibs://home:jirislaby:gcc11. That makes it work too. Note: it was not enough to compile qemu with gcc-11. I had to build ovmf too. I haven't tried to compile _only_ ovmf with gcc-11 yet. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jlee@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c1 --- Comment #1 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #0)
I haven't tried to compile _only_ ovmf with gcc-11 yet.
OK, it is enough to recompile only ovmf using gcc-11. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c2 --- Comment #2 from Jiri Slaby <jslaby@suse.com> --- Created attachment 858967 --> https://bugzilla.suse.com/attachment.cgi?id=858967&action=edit diff -u <(sort gcc11.log ) <(sort gcc12.log ) A diff of sorted build logs. There is not much a difference. Except gcc 12 produces a couple of these: +lto-wrapper: note: see the ���-flto��� option documentation for more information +lto-wrapper: warning: using serial compilation of 10 LTRANS jobs -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c3 Martin Li��ka <martin.liska@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |martin.liska@suse.com --- Comment #3 from Martin Li��ka <martin.liska@suse.com> --- All right, I can reproduce it, so I'm going to bisect that first. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c4 --- Comment #4 from Jiri Slaby <jslaby@suse.com> --- Created attachment 858969 --> https://bugzilla.suse.com/attachment.cgi?id=858969&action=edit efi debug log of gcc-12 Obtained using -global isa-debugcon.iobase=0x402 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c5 --- Comment #5 from Jiri Slaby <jslaby@suse.com> --- Created attachment 858970 --> https://bugzilla.suse.com/attachment.cgi?id=858970&action=edit efi debug log of gcc-11 (good) PEIMs are loaded at different entry points, the true diff starts after TemporaryRamMigration. The machine is apparently reset with gcc-12, with gcc-11 it proceeds with reinstalling PEIs after migration. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c6 --- Comment #6 from Martin Li��ka <martin.liska@suse.com> --- I've tried building the package without LTO and there's not change, it still hangs. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Michal Suchanek <msuchanek@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |msuchanek@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c7 --- Comment #7 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #3)
All right, I can reproduce it, so I'm going to bisect that first.
Ok, looking forward for the output. I tried to compare the assembly, but the diff is way too large. One of the changes are stack size changes (e.g. 0x58->0x70 in LzmaDec_DecodeReal2) in some functions, another is that gcc 12 eliminated ReadUnaligned16 and WriteUnaligned64 (It doesn't look like it inlined them). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c8 --- Comment #8 from Martin Li��ka <martin.liska@suse.com> --- So, I'm sorry but I can't easily bisect that when building the package. Jiri, is there any simpler way how to build ovmf-x86_64-opensuse-code.bin with a few steps (ideally from source)? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c9 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |needinfo?(jlee@suse.com) --- Comment #9 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #8)
So, I'm sorry but I can't easily bisect that when building the package. Jiri, is there any simpler way how to build ovmf-x86_64-opensuse-code.bin with a few steps (ideally from source)?
Sorry, it's mostly a blackbox for me. Hopefully, Joey knows... -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c10 --- Comment #10 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #9)
(In reply to Martin Li��ka from comment #8)
So, I'm sorry but I can't easily bisect that when building the package. Jiri, is there any simpler way how to build ovmf-x86_64-opensuse-code.bin with a few steps (ideally from source)?
Sorry, it's mostly a blackbox for me. Hopefully, Joey knows...
BTW using /usr/share/qemu/ovmf-x86_64-4m.bin (which also provides (separate) debuginfo) is broken the same way. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c11 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags|needinfo?(jlee@suse.com) | --- Comment #11 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #8)
So, I'm sorry but I can't easily bisect that when building the package. Jiri, is there any simpler way how to build ovmf-x86_64-opensuse-code.bin with a few steps (ideally from source)?
OK, I can do it with this: $ osc co openSUSE:Factory/ovmf && cd $_ $ osc build --root=/dev/shm/jslaby/xxx -x gcc11 when the buildroot is ready, I did ^C. $ sudo chroot /dev/shm/jslaby/xxx there, I had to (once) chmod 1777 /dev/shm/ su abuild cd /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202 source edksetup.sh build -D SECURE_BOOT_ENABLE -D TPM2_ENABLE -D TPM2_CONFIG_ENABLE -D NETWORK_IP6_ENABLE -D NETWORK_HTTP_BOOT_ENABLE -a X64 -b DEBUG -t GCC5 -p OvmfPkg/OvmfPkgX64.dsc -D FD_SIZE_4MB -D NETWORK_TLS_ENABLE It'd build the image, so it's here (the host path): /dev/shm/jslaby/xxx//home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd So: qemu-kvm -drive file=/dev/null,format=raw -drive if=pflash,format=raw,unit=0,readonly=on,file=/dev/shm/jslaby/xxx//home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/FV/OVMF.fd -m 3000 -global isa-debugcon.iobase=0x402 -debugcon stdio The command output should loop (repeated resets). Now, if I change: /usr/bin/gcc -> gcc-12 to /usr/bin/gcc -> gcc-11 Everything is allright. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c12 --- Comment #12 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #11)
Now, if I change: /usr/bin/gcc -> gcc-12 to /usr/bin/gcc -> gcc-11
(And run the build -D... command here again.)
Everything is allright.
-- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c13 --- Comment #13 from Martin Li��ka <martin.liska@suse.com> --- Thanks, works for me and I'm bisecting right now.. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c14 --- Comment #14 from Martin Li��ka <martin.liska@suse.com> --- (In reply to Martin Li��ka from comment #6)
I've tried building the package without LTO and there's not change, it still hangs.
I was not correct. The project uses LTO by default and when I really dropped -flto, it works fine. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c15 --- Comment #15 from Martin Li��ka <martin.liska@suse.com> ---
Ok, looking forward for the output. I tried to compare the assembly, but the diff is way too large.
For what file did you make the diff? What's the final ELF file that is then transformed to OVMF.fd? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c16 --- Comment #16 from Martin Li��ka <martin.liska@suse.com> --- So it started with GCC's revision https://github.com/gcc-mirror/gcc/commit/a7acb6dca941db2b1c135107dac3a34a206... which is a register allocation costing change. So it would be handy to know how to compare assembly output. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c17 --- Comment #17 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #16)
So it started with GCC's revision https://github.com/gcc-mirror/gcc/commit/ a7acb6dca941db2b1c135107dac3a34a20650d5c which is a register allocation costing change. So it would be handy to know how to compare assembly output.
Since it is looping in SecMain, I diffed: /dev/shm/jslaby/xxx/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/SecMain.debug -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Dario Faggioli <dfaggioli@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dfaggioli@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c18 --- Comment #18 from Martin Li��ka <martin.liska@suse.com> --- (In reply to Jiri Slaby from comment #17)
(In reply to Martin Li��ka from comment #16)
So it started with GCC's revision https://github.com/gcc-mirror/gcc/commit/ a7acb6dca941db2b1c135107dac3a34a20650d5c which is a register allocation costing change. So it would be handy to know how to compare assembly output.
Since it is looping in SecMain, I diffed: /dev/shm/jslaby/xxx/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/ OvmfX64/DEBUG_GCC5/X64/SecMain.debug
Thanks for the hint. I see a pretty big difference before and after the revision for this ELF. However, the following seems fishy: "gcc" -o /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/SecMain.dll -nostdlib -Wl,-n,-q,--gc-sections -z common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -u _ModuleEntryPoint -Wl,-Map,/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/SecMain.map,--whole-archive -Wl,-melf_x86_64,--oformat=elf64-x86-64,-pie -Os -Wl,--start-group,@/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/OUTPUT/static_library_files.lst,--end-group -g -Os -fshort-wchar -fno-builtin -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -include AutoGen.h -fno-common -Wno-error -flto=auto -ffunction-sections -fdata-sections -DSTRING_ARRAY_NAME=SecMainStrings -m64 -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" -maccumulate-outgoing-args -mno-red-zone -Wno-address -mcmodel=small -fpie -fno-asynchronous-unwind-tables -Wno-address -DUSING_LTO -Os -mno-mmx -mno-sse -D DISABLE_NEW_DEPRECATED_INTERFACES -D ENABLE_MD5_DEPRECATED_INTERFACES -Wl,--defsym=PECOFF_HEADER_SIZE=0x228 -Wl,--script=/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/BaseTools/Scripts/GccBase.lds -Wno-error /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdePkg/Include/Library/PeimEntryPoint.h:74:1: warning: type of 'ProcessLibraryConstructorList' does not match original declaration [-Wlto-type-mismatch] 74 | ProcessLibraryConstructorList ( | ^ /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/AutoGen.c:252:1: note: type mismatch in parameter 1 252 | ProcessLibraryConstructorList ( | ^ /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/AutoGen.c:252:1: note: type 'void' should match type 'void *' /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/AutoGen.c:252:1: note: 'ProcessLibraryConstructorList' was previously declared here /usr/bin/ld: /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/UefiCpuPkg/Library/CpuExceptionHandlerLib/SecPeiCpuExceptionHandlerLib/OUTPUT/SecPeiCpuExceptionHandlerLib.lib(ExceptionHandlerAsm.obj): warning: relocation in read-only section `.text' /usr/bin/ld: warning: creating DT_TEXTREL in a PIE Where the function signature differ: VOID EFIAPI ProcessLibraryConstructorList ( IN EFI_PEI_FILE_HANDLE FileHandle, IN CONST EFI_PEI_SERVICES **PeiServices ); VOID EFIAPI ProcessLibraryConstructorList ( VOID ) { It seems the build system disable the warning intentionally for some build configurations: Conf/tools_def.txt: DEBUG_GCC5_AARCH64_DLINK_FLAGS = DEF(GCC5_AARCH64_DLINK_FLAGS) -Os -L$(WORKSPACE)/ArmPkg/Library/GccLto -llto-aarch64 -Wl,-plugin-opt=-pass-through=-llto-aarch64 -Wno-lto-type-mismatch Conf/tools_def.txt:RELEASE_GCC5_AARCH64_DLINK_FLAGS = DEF(GCC5_AARCH64_DLINK_FLAGS) -Os -L$(WORKSPACE)/ArmPkg/Library/GccLto -llto-aarch64 -Wl,-plugin-opt=-pass-through=-llto-aarch64 -Wno-lto-type-mismatch Note the warning is serious and can easily lead to a miscompilation.. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c19 --- Comment #19 from Martin Li��ka <martin.liska@suse.com> --- I was able to modify that in: https://github.com/tianocore/edk2/blob/master/BaseTools/Source/Python/AutoGe... The LTO warning is gone while the miscompilation is still there.. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c20 --- Comment #20 from Martin Li��ka <martin.liska@suse.com> --- Note the culprit GCC revision is about the register allocation, so maybe there's some violation coming from ./OvmfPkg/Sec/X64/SecEntry.nasm assembly wrapper? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c21 --- Comment #21 from Joey Lee <jlee@suse.com> --- This issue can be reproduced with the following command with debug log: qemu-kvm -m 3000 -serial mon:stdio -nographic -pflash /usr/share/qemu/ovmf-x86_64-opensuse-code.bin -debugcon file:/tmp/debug.log -global isa-debugcon.iobase=0x402 Or with local build OVMF: qemu-kvm -m 3000 -serial mon:stdio -nographic -pflash ./Build/OvmfX64/DEBUG_GCC5/FV/OVMF_CODE.fd -debugcon file:/tmp/debug.log -global isa-debugcon.iobase=0x402 Base on the OVMF debug log, the looping is because TemporaryRamMigration() fail: SecCoreStartupWithStack(0xFFFCC000, 0x820000) Register PPI Notify: DCD0BE23-9586-40F4-B643-06522CED4EDE Install PPI: 8C8CE578-8A3D-4F1C-9935-896185C32DD3 Install PPI: 5473C07A-3DCB-4DCA-BD6F-1E9689E7349A [...snip] Stack Hob: BaseAddress=0xB7736000 Length=0x20000 Heap Offset = 0xB6F46000 Stack Offset = 0xB6F36000 TemporaryRamMigration(0x810000, 0xB774E000, 0x10000) <-- latest log SecCoreStartupWithStack(0xFFFCC000, 0x820000) <-- back to the first log again, looping... Normally firmware will transit from SEC phase to PEI phase after TemporaryRamMigration. But when issue happened, it did not transit to PEI phase. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c22 --- Comment #22 from Joey Lee <jlee@suse.com> --- The mainline EDK/OVMF (latest commit: 708620d29d) can also reproduce this issue. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Claudio Fontana <claudio.fontana@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |claudio.fontana@suse.com, | |mjambor@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c23 --- Comment #23 from Jiri Slaby <jslaby@suse.com> --- What a heisenbug. I added few more DEBUG()s to TemporaryRamMigration() and the bug is gone. So I assume it's the register allocation exactly in that function. Investigating... -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c24 --- Comment #24 from Joey Lee <jlee@suse.com> --- (In reply to Joey Lee from comment #21)
This issue can be reproduced with the following command with debug log:
qemu-kvm -m 3000 -serial mon:stdio -nographic -pflash /usr/share/qemu/ovmf-x86_64-opensuse-code.bin -debugcon file:/tmp/debug.log -global isa-debugcon.iobase=0x402
Or with local build OVMF:
qemu-kvm -m 3000 -serial mon:stdio -nographic -pflash ./Build/OvmfX64/DEBUG_GCC5/FV/OVMF_CODE.fd -debugcon file:/tmp/debug.log -global isa-debugcon.iobase=0x402
Base on the OVMF debug log, the looping is because TemporaryRamMigration() fail:
I have added more debug log to TemporaryRamMigration(). The TemporaryRamMigration successed and returned. So the problem should happened in other place when firmware transit to PEI phase. On the other hand, 4M OVMF can also produce issue. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c25 --- Comment #25 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #23)
What a heisenbug. I added few more DEBUG()s to TemporaryRamMigration() and the bug is gone.
(In reply to Joey Lee from comment #24)
I have added more debug log to TemporaryRamMigration(). The TemporaryRamMigration successed and returned. So the problem should happened in other place when firmware transit to PEI phase.
Yes, adding a single DEBUG() at the end of TemporaryRamMigration() makes it print and the bug still occurs. So this makes it work for me -- comment one DEBUG and it doesn't. Now, going to look into the generated .s. --- a/OvmfPkg/Sec/SecMain.c 2022-02-21 16:19:40.000000000 +0100 +++ b/OvmfPkg/Sec/SecMain.c 2022-05-18 08:25:45.617538962 +0200 @@ -953,6 +953,8 @@ PermanentMemoryBase, (UINT64)CopySize )); + DEBUG((DEBUG_INFO, "TemporaryRamMigration %d\n", __LINE__)); + DEBUG((DEBUG_INFO, "TemporaryRamMigration %d\n", __LINE__)); OldHeap = (VOID *)(UINTN)TemporaryMemoryBase; NewHeap = (VOID *)((UINTN)PermanentMemoryBase + (CopySize >> 1)); -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c26 --- Comment #26 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #25)
--- a/OvmfPkg/Sec/SecMain.c 2022-02-21 16:19:40.000000000 +0100 +++ b/OvmfPkg/Sec/SecMain.c 2022-05-18 08:25:45.617538962 +0200 @@ -953,6 +953,8 @@ PermanentMemoryBase, (UINT64)CopySize )); + DEBUG((DEBUG_INFO, "TemporaryRamMigration %d\n", __LINE__)); + DEBUG((DEBUG_INFO, "TemporaryRamMigration %d\n", __LINE__));
OldHeap = (VOID *)(UINTN)TemporaryMemoryBase; NewHeap = (VOID *)((UINTN)PermanentMemoryBase + (CopySize >> 1));
Oh well, the marginal difference is that the working code uses %rbp: XX <TemporaryRamMigration>: + push %rbp + mov %rdx,%rbp push %rdi - mov %rdx,%rdi push %rsi mov %r9,%rsi push %rbx mov %r8,%rbx - sub $0x150,%rsp + sub $0x158,%rsp call XX <DebugPrintEnabled> test %al,%al jne XX <TemporaryRamMigrationXX> shr %rsi mov %rbx,%rax - mov %rdi,%rdx + mov %rbp,%rdx mov %rsi,%r8 - lea (%rsi,%rdi,1),%rsi + lea (%rsi,%rbp,1),%rsi lea (%r8,%rbx,1),%rcx sub %rsi,%rax mov %r8,0x38(%rsp) @@ -6387,21 +6388,27 @@ XX <TemporaryRamMigration>: call XX <InternalLongJump> jmp XX <TemporaryRamMigrationXX> mov %r9,0x20(%rsp) - mov $0x40,%ecx + lea 0x23b1(%rip),%rdi # XX <.LC94> mov %r8,%r9 + mov $0x40,%ecx mov %rdx,%r8 - lea 0x236b(%rip),%rdx # XX <.LC93> + lea 0x2373(%rip),%rdx # XX <.LC93> call XX <DebugPrint> + mov %rdi,%rdx mov $0x3bc,%r8d mov $0x40,%ecx - lea 0x2380(%rip),%rdx # XX <.LC94> + call XX <DebugPrint> + mov $0x3bd,%r8d + mov %rdi,%rdx + mov $0x40,%ecx call XX <DebugPrint> jmp XX <TemporaryRamMigrationXX> - add $0x150,%rsp + add $0x158,%rsp xor %eax,%eax pop %rbx pop %rsi pop %rdi + pop %rbp ret -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c27 --- Comment #27 from Martin Li��ka <martin.liska@suse.com> ---
Oh well, the marginal difference is that the working code uses %rbp: XX <TemporaryRamMigration>: + push %rbp + mov %rdx,%rbp push %rdi - mov %rdx,%rdi push %rsi mov %r9,%rsi push %rbx mov %r8,%rbx - sub $0x150,%rsp + sub $0x158,%rsp call XX <DebugPrintEnabled> test %al,%al jne XX <TemporaryRamMigrationXX> shr %rsi mov %rbx,%rax - mov %rdi,%rdx + mov %rbp,%rdx mov %rsi,%r8 - lea (%rsi,%rdi,1),%rsi + lea (%rsi,%rbp,1),%rsi lea (%r8,%rbx,1),%rcx sub %rsi,%rax mov %r8,0x38(%rsp) @@ -6387,21 +6388,27 @@ XX <TemporaryRamMigration>: call XX <InternalLongJump> jmp XX <TemporaryRamMigrationXX> mov %r9,0x20(%rsp) - mov $0x40,%ecx + lea 0x23b1(%rip),%rdi # XX <.LC94> mov %r8,%r9 + mov $0x40,%ecx mov %rdx,%r8 - lea 0x236b(%rip),%rdx # XX <.LC93> + lea 0x2373(%rip),%rdx # XX <.LC93> call XX <DebugPrint> + mov %rdi,%rdx mov $0x3bc,%r8d mov $0x40,%ecx - lea 0x2380(%rip),%rdx # XX <.LC94> + call XX <DebugPrint> + mov $0x3bd,%r8d + mov %rdi,%rdx + mov $0x40,%ecx call XX <DebugPrint> jmp XX <TemporaryRamMigrationXX> - add $0x150,%rsp + add $0x158,%rsp xor %eax,%eax pop %rbx pop %rsi pop %rdi + pop %rbp ret
I can confirm that, it's really weird. The bad code doesn't use $rbp and uses smaller stack frame. There's something fishy somewhere.. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c28 --- Comment #28 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #27)
I can confirm that, it's really weird. The bad code doesn't use $rbp and uses smaller stack frame. There's something fishy somewhere..
gcc-11 doesn't use %rbp either, but works. But gcc-11 does not touch %rdi -- the same as gcc-12-good. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c29 --- Comment #29 from Joey Lee <jlee@suse.com> --- (In reply to Jiri Slaby from comment #28)
(In reply to Martin Li��ka from comment #27)
I can confirm that, it's really weird. The bad code doesn't use $rbp and uses smaller stack frame. There's something fishy somewhere..
gcc-11 doesn't use %rbp either, but works. But gcc-11 does not touch %rdi -- the same as gcc-12-good.
Just for note, in the Calling Conventions section in UEFI spec. The RDI is nonvolatile and not be used to pass argument: 2.3.4.2 Detailed Calling Conventions The caller passes the first four integer arguments in registers. The integer values are passed from left to right in Rcx, Rdx, R8, and R9 registers. The caller passes arguments five and above onto the stack. All arguments must be right-justified in the register in which they are passed. This ensures the callee can process only the bits in the register that are required. [...snip] The registers Rax, Rcx Rdx R8, R9, R10, R11, and XMM0-XMM5 are volatile and are, therefore, destroyed on function calls. The registers RBX, RBP, RDI, RSI, R12, R13, R14, R15, and XMM6-XMM15 are considered nonvolatile and must be saved and restored by a function that uses them. ... -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c30 --- Comment #30 from Richard Biener <rguenther@suse.com> --- (In reply to Joey Lee from comment #29)
(In reply to Jiri Slaby from comment #28)
(In reply to Martin Li��ka from comment #27)
I can confirm that, it's really weird. The bad code doesn't use $rbp and uses smaller stack frame. There's something fishy somewhere..
gcc-11 doesn't use %rbp either, but works. But gcc-11 does not touch %rdi -- the same as gcc-12-good.
Just for note, in the Calling Conventions section in UEFI spec. The RDI is nonvolatile and not be used to pass argument:
2.3.4.2 Detailed Calling Conventions
The caller passes the first four integer arguments in registers. The integer values are passed from left to right in Rcx, Rdx, R8, and R9 registers. The caller passes arguments five and above onto the stack. All arguments must be right-justified in the register in which they are passed. This ensures the callee can process only the bits in the register that are required. [...snip] The registers Rax, Rcx Rdx R8, R9, R10, R11, and XMM0-XMM5 are volatile and are, therefore, destroyed on function calls. The registers RBX, RBP, RDI, RSI, R12, R13, R14, R15, and XMM6-XMM15 are considered nonvolatile and must be saved and restored by a function that uses them. ...
How does the code ensure this? I see no -ffixed-rdi, does the code declare global register variables with it? Can somebody attach preprocessed source of the TU with this function? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c31 --- Comment #31 from Martin Li��ka <martin.liska@suse.com> ---
How does the code ensure this? I see no -ffixed-rdi, does the code declare global register variables with it? Can somebody attach preprocessed source of the TU with this function?
They use __attribute__((ms_abi)) here and there.. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c32 --- Comment #32 from Martin Li��ka <martin.liska@suse.com> --- Created attachment 859016 --> https://bugzilla.suse.com/attachment.cgi?id=859016&action=edit good.s for SecMain.dll -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c33 --- Comment #33 from Martin Li��ka <martin.liska@suse.com> --- Created attachment 859017 --> https://bugzilla.suse.com/attachment.cgi?id=859017&action=edit bad.s for SecMain.dll Created with: "gcc" -o /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/SecMain.dll -nostdlib -Wl,-n,-q,--gc-sections -z common-page-size=0x40 -Wl,--entry,_ModuleEntryPoint -u _ModuleEntryPoint -Wl,-Map,/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/DEBUG/SecMain.map,--whole-archive -Wl,-melf_x86_64,--oformat=elf64-x86-64,-pie -Os -Wl,--start-group,@/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfX64/DEBUG_GCC5/X64/OvmfPkg/Sec/SecMain/OUTPUT/static_library_files.lst,--end-group -g -Os -fshort-wchar -fno-builtin -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -include AutoGen.h -fno-common -Wno-error -flto=auto -ffunction-sections -fdata-sections -DSTRING_ARRAY_NAME=SecMainStrings -m64 -fno-stack-protector "-DEFIAPI=__attribute__((ms_abi))" -maccumulate-outgoing-args -mno-red-zone -Wno-address -mcmodel=small -fpie -fno-asynchronous-unwind-tables -Wno-address -DUSING_LTO -Os -mno-mmx -mno-sse -D DISABLE_NEW_DEPRECATED_INTERFACES -D ENABLE_MD5_DEPRECATED_INTERFACES -Wl,--defsym=PECOFF_HEADER_SIZE=0x228 -Wl,--script=/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/BaseTools/Scripts/GccBase.lds -Wno-error -flto-partition=one --save-temps --verbose -g0 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c34 --- Comment #34 from Richard Biener <rguenther@suse.com> --- Oh, but I note that GCC pushes/pops %rdi so using %rdi should be fine then. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c35 --- Comment #35 from Michal Suchanek <msuchanek@suse.com> --- Does it work if the whole thing is compiled with -mabi=ms_abi? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c36 --- Comment #36 from Jiri Slaby <jslaby@suse.com> --- Hmm, actually TemporaryRamMigration overwrites rbp unconditionally and that's the bug IMO. It adds an offset to rbp even if rbp is NOT used as frame pointer And indeed commenting out: //JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset; makes it work too. Marking TemporaryRamMigration() as: __attribute__((optimize("-fno-omit-frame-pointer"))) works too. The code is: if (SetJump (&JumpBuffer) == 0) { #if defined (MDE_CPU_IA32) JumpBuffer.Esp = JumpBuffer.Esp + DebugAgentContext.StackMigrateOffset; JumpBuffer.Ebp = JumpBuffer.Ebp + DebugAgentContext.StackMigrateOffset; #endif #if defined (MDE_CPU_X64) JumpBuffer.Rsp = JumpBuffer.Rsp + DebugAgentContext.StackMigrateOffset; JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset; #endif LongJump (&JumpBuffer, (UINTN)-1); } It was only coincidence this ever worked -- gcc-11 omits frame pointer too, but apparently the caller does not use rbp. Who is the caller of TemporaryRamMigration() BTW? There is an indirection and I fail to look that up. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c37 --- Comment #37 from Martin Li��ka <martin.liska@suse.com> --- (In reply to Michal Suchanek from comment #35)
Does it work if the whole thing is compiled with -mabi=ms_abi?
No, adding -mabi=ms to GCC_ALL_CC_FLAGS does not help. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c38 --- Comment #38 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #36)
Who is the caller of TemporaryRamMigration() BTW?
PeiCheckAndSwitchStack() that is (gcc-12): 79a6: 4c 29 fd sub %r15,%rbp <------ used rbp 79a9: 4d 29 fe sub %r15,%r14 79ac: 48 83 ec 20 sub $0x20,%rsp 79b0: 4d 89 e0 mov %r12,%r8 79b3: 48 8d 4b 08 lea 0x8(%rbx),%rcx 79b7: 48 8b 44 24 50 mov 0x50(%rsp),%rax 79bc: 48 8b 54 24 20 mov 0x20(%rsp),%rdx 79c1: 4d 29 e8 sub %r13,%r8 79c4: 4c 8b 4c 24 30 mov 0x30(%rsp),%r9 79c9: ff 10 call *(%rax) <----------- call to TemporaryRamMigration 79cb: 48 83 c4 20 add $0x20,%rsp 79cf: be 01 00 00 00 mov $0x1,%esi 79d4: 4c 89 f7 mov %r14,%rdi 79d7: e8 f4 a8 ff ff call 22d0 <MigrateMemoryPages> 79dc: 48 83 ec 20 sub $0x20,%rsp 79e0: 4d 89 f0 mov %r14,%r8 79e3: 31 d2 xor %edx,%edx 79e5: 48 89 e9 mov %rbp,%rcx <------ rbp used gcc-11 seems to copy rbp to r8 first and operates on r8 there. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c39 --- Comment #39 from Jiri Slaby <jslaby@suse.com> --- Now, what is the right fix? Do the setjump/longjump in assembly and wrap it into push rbp/pop rbp? Mark the function as __attribute__((optimize("-fno-omit-frame-pointer"))) -- does it ensure, the rbp is be saved/restored properly? Any other idea? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c40 --- Comment #40 from Martin Li��ka <martin.liska@suse.com> --- (In reply to Jiri Slaby from comment #39) I would report that to upstream and let them pick up a proper fix.
Now, what is the right fix? Do the setjump/longjump in assembly and wrap it into push rbp/pop rbp?
This seems to work fine.
Mark the function as __attribute__((optimize("-fno-omit-frame-pointer"))) -- does it ensure, the rbp is be saved/restored properly?
Do you mean in TemporaryRamMigration, right?
Any other idea?
-- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c41 --- Comment #41 from Martin Li��ka <martin.liska@suse.com> ---
Mark the function as __attribute__((optimize("-fno-omit-frame-pointer"))) -- does it ensure, the rbp is be saved/restored properly?
I don't think so: man gcc: -fomit-frame-pointer ... Note that -fno-omit-frame-pointer doesn't guarantee the frame pointer is used in all functions. Several targets always omit the frame pointer in leaf functions. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c42 --- Comment #42 from Jiri Slaby <jslaby@suse.com> --- Created attachment 859051 --> https://bugzilla.suse.com/attachment.cgi?id=859051&action=edit e-mail to devel@edk2.groups.io (In reply to Martin Li��ka from comment #40)
I would report that to upstream and let them pick up a proper fix.
I've sent this to devel@edk2.groups.io. Not sure if it goes through. Reassigning to Joey who likely has an edk bugzilla account (they are created only upon request -- WTF?). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|virt-bugs@suse.de |jlee@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|gcc-12 miscompiles ovmf so |ovmf contains UB so qemu |qemu does not start |does not start (triggered | |by gcc-12) -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c43 --- Comment #43 from Joey Lee <jlee@suse.com> --- (In reply to Jiri Slaby from comment #42)
Created attachment 859051 [details] e-mail to devel@edk2.groups.io
(In reply to Martin Li��ka from comment #40)
I would report that to upstream and let them pick up a proper fix.
I've sent this to devel@edk2.groups.io. Not sure if it goes through. Reassigning to Joey who likely has an edk bugzilla account (they are created only upon request -- WTF?).
Thanks for Jiri's help! I will follow up. Currently I didn't see a good way to fix it. Removing "JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset" will cause source source level debugging has problem. Looks that we need to find a way to avoid the RBP be used in PeiCheckAndSwitchStack(). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c44 --- Comment #44 from Joey Lee <jlee@suse.com> --- (In reply to Jiri Slaby from comment #42)
Created attachment 859051 [details] e-mail to devel@edk2.groups.io
(In reply to Martin Li��ka from comment #40)
I would report that to upstream and let them pick up a proper fix.
I've sent this to devel@edk2.groups.io. Not sure if it goes through. Reassigning to Joey who likely has an edk bugzilla account (they are created only upon request -- WTF?).
hm... I also can not login to EDK2 bugzilla by edk2.groups.io account. I will request to create a bugzilla account. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c45 --- Comment #45 from Jiri Slaby <jslaby@suse.com> --- (In reply to Joey Lee from comment #43)
Removing "JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset" will cause source source level debugging has problem.
Why do you think so? ovmf is built without frame pointers anyway. So rbp is used everywhere as a general purpose register (GPR). (The unwinding now relies on dwarf unwinder, not on the frame pointer one.)
Looks that we need to find a way to avoid the RBP be used in PeiCheckAndSwitchStack().
That won't help much as caller of PeiCheckAndSwitchStack() (or any its callee) still can use rbp as GPR. In kernel, we have CONFIG_FRAME_POINTER. And either we take rbp into account (if set) or not. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c46 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |needinfo? --- Comment #46 from Jiri Slaby <jslaby@suse.com> --- I've just realized that rbp is a non-volatile register and should be callee saved if modified. So we should likely somehow force the compiler to generate push rbp/pop rbp sequence in TemporaryRamMigration() provided we modify it. The question is how? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c47 --- Comment #47 from Martin Li��ka <martin.liska@suse.com> --- (In reply to Jiri Slaby from comment #46)
I've just realized that rbp is a non-volatile register and should be callee saved if modified. So we should likely somehow force the compiler to generate push rbp/pop rbp sequence in TemporaryRamMigration() provided we modify it.
The question is how?
What about: asm volatile ("pushq %rbp"); ... asm volatile ("popq %rbp"); ? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c48 --- Comment #48 from Jiri Slaby <jslaby@suse.com> --- (In reply to Martin Li��ka from comment #47)
asm volatile ("pushq %rbp"); ... asm volatile ("popq %rbp");
First, that would need to be 32/64 bit dependent (ebp/rpb). Second, it would break TemporaryRamMigration() when it uses bp as a frame pointer. As it would reference the old location of the stack after pop bp. The same would happen if the bp switch is done in assembly and is wrapped around SetJmp/LongJmp as I suggested in comment 39. I actually don't know how to fix it properly :/. Unless we know for sure CC uses FP for that function or not. Not sure how the kernel's CONFIG_FRAME_POINTER work in this regard. As that is exactly what would be needed here -- do that: JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset only when the function uses FP (i.e. rbp to reference the stack). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c49 --- Comment #49 from Richard Biener <rguenther@suse.com> --- (In reply to Jiri Slaby from comment #48)
(In reply to Martin Li��ka from comment #47)
asm volatile ("pushq %rbp"); ... asm volatile ("popq %rbp");
First, that would need to be 32/64 bit dependent (ebp/rpb).
Second, it would break TemporaryRamMigration() when it uses bp as a frame pointer. As it would reference the old location of the stack after pop bp. The same would happen if the bp switch is done in assembly and is wrapped around SetJmp/LongJmp as I suggested in comment 39.
I actually don't know how to fix it properly :/. Unless we know for sure CC uses FP for that function or not. Not sure how the kernel's CONFIG_FRAME_POINTER work in this regard. As that is exactly what would be needed here -- do that: JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset only when the function uses FP (i.e. rbp to reference the stack).
I think %rbp will be saved/restored by the pro-/epilogue if you make use of it, so instead of a push/pop just do asm volatile ("" : : : "rbp"); unfortunately(?) that requires -fomit-frame-pointer (at least for the select function using it). void foo () { asm volatile ("# bar" ::: "rbp"); }
gcc -S t.c -mabi=ms -fomit-frame-pointer -O2 cat t.s .file "t.c" .text .p2align 4,,15 .globl foo .type foo, @function foo: .LFB0: .cfi_startproc pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset 6, -16 #APP # 3 "t.c" 1 # bar # 0 "" 2 #NO_APP popq %rbp .cfi_restore 6 .cfi_def_cfa_offset 8 ret .cfi_endproc .LFE0: .size foo, .-foo .ident "GCC: (SUSE Linux) 7.5.0" .section .note.GNU-stack,"",@progbits
so try if that's enough. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c50 --- Comment #50 from Jiri Slaby <jslaby@suse.com> --- (In reply to Richard Biener from comment #49)
(In reply to Jiri Slaby from comment #48)
(In reply to Martin Li��ka from comment #47)
asm volatile ("pushq %rbp"); ... asm volatile ("popq %rbp");
First, that would need to be 32/64 bit dependent (ebp/rpb).
Second, it would break TemporaryRamMigration() when it uses bp as a frame pointer. As it would reference the old location of the stack after pop bp. The same would happen if the bp switch is done in assembly and is wrapped around SetJmp/LongJmp as I suggested in comment 39.
I actually don't know how to fix it properly :/. Unless we know for sure CC uses FP for that function or not. Not sure how the kernel's CONFIG_FRAME_POINTER work in this regard. As that is exactly what would be needed here -- do that: JumpBuffer.Rbp = JumpBuffer.Rbp + DebugAgentContext.StackMigrateOffset only when the function uses FP (i.e. rbp to reference the stack).
I think %rbp will be saved/restored by the pro-/epilogue if you make use of it, so instead of a push/pop just do
asm volatile ("" : : : "rbp");
I don't know why, but: [ 62s] "gcc" -MMD -MF /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfIa32/DEBUG_GCC5/IA32/OvmfPkg/Sec/SecMain/OUTPUT/SecMain.obj.deps -g -Os -fshort-wchar -fno-builtin -fno-strict-aliasing -Wall -Werror -Wno-array-bounds -include AutoGen.h -fno-common -ffunction-sections -fdata-sections -DSTRING_ARRAY_NAME=SecMainStrings -m32 -march=i586 -malign-double -fno-stack-protector -D EFI32 -fno-asynchronous-unwind-tables -Wno-address -fno-pic -fno-pie -fno-pic -fno-pie -flto -Os -mno-mmx -mno-sse -D DISABLE_NEW_DEPRECATED_INTERFACES -D ENABLE_MD5_DEPRECATED_INTERFACES -c -o /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfIa32/DEBUG_GCC5/IA32/OvmfPkg/Sec/SecMain/OUTPUT/./SecMain.obj -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Sec/Ia32 -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Sec -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/Build/OvmfIa32/DEBUG_GCC5/IA32/OvmfPkg/Sec/SecMain/DEBUG -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdePkg -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdePkg/Include -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdePkg/Test/UnitTest/Include -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdePkg/Include/Ia32 -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdeModulePkg -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/MdeModulePkg/Include -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/UefiCpuPkg -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/UefiCpuPkg/Include -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Include -I/home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Csm/Include /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Sec/SecMain.c ... [ 64s] /home/abuild/rpmbuild/BUILD/edk2-edk2-stable202202/OvmfPkg/Sec/SecMain.c:1007:1: error: bp cannot be used in ���asm��� here It appears like -fno-omit-frame-pointer was in place. But their -Os should avoid frame pointers... So? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c51 --- Comment #51 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #50)
[ 62s] "gcc" ... -m32 -march=i586
And: EFI_STATUS __attribute__((cdecl)) TemporaryRamMigration ( const EFI_PEI_SERVICES **PeiServices, EFI_PHYSICAL_ADDRESS TemporaryMemoryBase, EFI_PHYSICAL_ADDRESS PermanentMemoryBase, UINTN CopySize ) { That source is built multiple times, for x86_32 using cdecl which doesn't allow mentioning bp in asm as (IMO) frame pointer cannot be omitted. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c52 --- Comment #52 from Richard Biener <rguenther@suse.com> --- I think the error: bp cannot be used in ���asm��� here is likely because the frame pointer cannot be eliminated with the asm in place? So it probably needs an explicit -fomit-frame-pointer -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c53 --- Comment #53 from Joey Lee <jlee@suse.com> --- Hi Jiri, (In reply to Jiri Slaby from comment #42)
Created attachment 859051 [details] e-mail to devel@edk2.groups.io
(In reply to Martin Li��ka from comment #40)
I would report that to upstream and let them pick up a proper fix.
I've sent this to devel@edk2.groups.io. Not sure if it goes through. Reassigning to Joey who likely has an edk bugzilla account (they are created only upon request -- WTF?).
I didn't receive your mail from devel@edk2.groups.io. And I also can not search out it from archive: https://edk2.groups.io/g/devel/messages?start=5:2022:420 Did you send it? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c54 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags|needinfo? | --- Comment #54 from Jiri Slaby <jslaby@suse.com> --- (In reply to Joey Lee from comment #53)
I didn't receive your mail from devel@edk2.groups.io. And I also can not search out it from archive:
https://edk2.groups.io/g/devel/messages?start=5:2022:420
Did you send it?
Hi, yes, comment 42 says its message-id is: 887c3f4f-c279-bd59-d92d-25922faae6dc@kernel.org But the list is moderated and I am not subscribed. So either they dropped it or haven't approved it yet. Feel free to send it from your e-mail if you are subscribed, possibly adding more info from the recent comments. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c55 --- Comment #55 from Joey Lee <jlee@suse.com> --- (In reply to Jiri Slaby from comment #54)
(In reply to Joey Lee from comment #53)
I didn't receive your mail from devel@edk2.groups.io. And I also can not search out it from archive:
https://edk2.groups.io/g/devel/messages?start=5:2022:420
Did you send it?
Hi, yes, comment 42 says its message-id is: 887c3f4f-c279-bd59-d92d-25922faae6dc@kernel.org
But the list is moderated and I am not subscribed. So either they dropped it or haven't approved it yet. Feel free to send it from your e-mail if you are subscribed, possibly adding more info from the recent comments.
I got account of tianocore bugzilla. I have filed this issue on upstream: https://bugzilla.tianocore.org/show_bug.cgi?id=3934 Bug 3934 - ovmf miscompiles with gcc-12 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c56 --- Comment #56 from Jiri Slaby <jslaby@suse.com> --- There is a PR to fix this: https://github.com/tianocore/edk2/pull/2954 Similar to this: (In reply to Jiri Slaby from comment #45)
In kernel, we have CONFIG_FRAME_POINTER. And either we take rbp into account (if set) or not.
-- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c57 --- Comment #57 from Joey Lee <jlee@suse.com> --- (In reply to Jiri Slaby from comment #56)
There is a PR to fix this: https://github.com/tianocore/edk2/pull/2954
Similar to this:
(In reply to Jiri Slaby from comment #45)
In kernel, we have CONFIG_FRAME_POINTER. And either we take rbp into account (if set) or not.
hm... The patch be reverted. https://github.com/tianocore/edk2/pull/2968 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c58 --- Comment #58 from Jiri Slaby <jslaby@suse.com> --- (In reply to Joey Lee from comment #57)
(In reply to Jiri Slaby from comment #56)
There is a PR to fix this: https://github.com/tianocore/edk2/pull/2954
Similar to this:
(In reply to Jiri Slaby from comment #45)
In kernel, we have CONFIG_FRAME_POINTER. And either we take rbp into account (if set) or not.
hm... The patch be reverted.
That's the old patch. The one you accepted to ovmf is newer and works: https://build.opensuse.org/package/view_file/Virtualization/ovmf/ovmf-tools_... I.e. it updates GCC48_{IA32,X64}_CC_FLAGS, not GCC_{IA32,X64}_CC_FLAGS. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1199597 https://bugzilla.suse.com/show_bug.cgi?id=1199597#c59 Joey Lee <jlee@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|NEW |RESOLVED Resolution|--- |FIXED --- Comment #59 from Joey Lee <jlee@suse.com> --- Patch be merged to openSUSE:Factory. Set this bug to FIXED. -- You are receiving this mail because: You are on the CC list for the bug.
participants (1)
-
bugzilla_noreply@suse.com