[Bug 1133245] New: LTO: libsigsegv build fails
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245

Bug ID: 1133245
Summary: LTO: libsigsegv build fails
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: martin.liska@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Fails due to:

[ 45s] /usr/include/asm/sigcontext.h:40:8: error: redefinition of 'struct _fpx_sw_bytes'

There's some broken configure:

Bad configure:
checking whether a fault handler according to Hurd works... no
checking ucontext.h usability... yes
checking ucontext.h presence... yes
checking for ucontext.h... yes

vs.

checking whether a fault handler according to Hurd works... no
checking for the fault handler specifics... fault-linux-x86_64-old.h
checking if the system supports catching SIGSEGV... no
checking for stack direction... grows down

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245

Martin Liška <martin.liska@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Blocks|                            |1133084
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c1

Dr. Werner Fink <werner@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |martin.liska@suse.com
              Flags|                            |needinfo?(martin.liska@suse.com)

--- Comment #1 from Dr. Werner Fink <werner@suse.com> ---
Could you please explain what you are doing? AFAICS the struct _fpx_sw_bytes is not used directly in libsigsegv:

libsigsegv/libsigsegv-2.12> grep -rs _fpx_sw_bytes .
libsigsegv/libsigsegv-2.12>

Therefore I suggest having a look at the system headers:

/suse/werner> grep -rs _fpx_sw_bytes /usr/include/ /usr/lib64/gcc/
/usr/include/bits/sigcontext.h:struct _fpx_sw_bytes
/usr/include/arch-x86/asm/sigcontext.h:struct _fpx_sw_bytes {
/usr/include/arch-x86/asm/sigcontext.h: struct _fpx_sw_bytes sw_reserved; /* Potential extended state is encoded here */
/usr/include/arch-x86/asm/sigcontext.h: struct _fpx_sw_bytes sw_reserved; /* Potential extended state is encoded here */
/usr/include/arch-x86/asm/sigcontext.h: * (struct _fpx_sw_bytes)
/usr/include/arch-x86/asm/sigcontext.h: * (struct _fpx_sw_bytes)

Besides this, the situation here is the same as with the texlive binaries: I do not want to have libsigsegv compiled/linked with LTO support!
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c2

--- Comment #2 from Martin Liška <martin.liska@suse.com> ---
> Besides this, the situation here is the same as with the texlive binaries: I do not want to have libsigsegv compiled/linked with LTO support!

I accept that:
https://build.opensuse.org/request/show/697610
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245

Martin Liška <martin.liska@suse.com> changed:

           What    |Removed                          |Added
----------------------------------------------------------------------------
              Flags|needinfo?(martin.liska@suse.com) |
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c4

--- Comment #4 from Dr. Werner Fink <werner@suse.com> ---
The problem seems to be that <bits/sigcontext.h> is a copy of <asm/sigcontext.h> but with stripped comments. It is interesting that this causes trouble when -flto is used, as the definitions are equal and the compiler should not throw an error but only a warning. AFAICR <bits/sigcontext.h> is read by <signal.h> for -D_DEFAULT_SOURCE (which enables __USE_MISC).

Besides this, I've added the change from your submit request 697610 by hand but have to decline the request because I was not aware of it, sorry!
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c5

--- Comment #5 from Martin Liška <martin.liska@suse.com> ---
I guess the difference is caused by a divergence in configure script:

$ diff -u nolto.txt lto.txt
--- nolto.txt 2019-04-25 09:19:28.461659689 +0200
+++ lto.txt 2019-04-25 09:20:41.047043653 +0200
@@ -1,5 +1,5 @@
 checking for mmap of /dev/zero... yes
-checking whether a fault handler according to POSIX works... yes
+checking whether a fault handler according to POSIX works... no
 checking whether a fault handler according to Linux/i386 works... no
 checking whether a fault handler according to old Linux/i386 works... no
 checking whether a fault handler according to Linux/m68k works... no
@@ -14,11 +14,8 @@
 checking whether a fault handler according to MacOSX/Darwin7 PowerPC works... no
 checking whether a fault handler according to MacOSX/Darwin5 PowerPC works... no
 checking whether a fault handler according to Hurd works... no
-checking ucontext.h usability... yes
-checking ucontext.h presence... yes
-checking for ucontext.h... yes
-checking for the fault handler specifics... fault-linux-i386.h
-checking if the system supports catching SIGSEGV... yes
+checking for the fault handler specifics... fault-linux-x86_64-old.h
+checking if the system supports catching SIGSEGV... no
 checking for stack direction... grows down
 checking for prmap_t in sys/procfs.h... no
 checking for mquery... no
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c6

--- Comment #6 from Martin Liška <martin.liska@suse.com> ---
I've been debugging the corresponding conftest.c file.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c7

--- Comment #7 from Martin Liška <martin.liska@suse.com> ---
I've got it. So for the following conftest:

/* confdefs.h */
#define PACKAGE_NAME ""
#define PACKAGE_TARNAME ""
#define PACKAGE_VERSION ""
#define PACKAGE_STRING ""
#define PACKAGE_BUGREPORT ""
#define PACKAGE_URL ""
#define PACKAGE "libsigsegv"
#define VERSION "2.12"
#define STDC_HEADERS 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_MEMORY_H 1
#define HAVE_STRINGS_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_STDINT_H 1
#define HAVE_UNISTD_H 1
#define HAVE_DLFCN_H 1
#define LT_OBJDIR ".libs/"
#define HAVE_SYS_SIGNAL_H 1
#define CFG_SIGNALS "signals.h"
#define HAVE_UNISTD_H 1
#define HAVE_GETPAGESIZE 1
#define HAVE_SYSCONF_PAGESIZE 1
#define HAVE_MMAP_ANON 1
#define HAVE_MMAP_ANONYMOUS 1
#define HAVE_MMAP_DEVZERO 1
/* end confdefs.h. */


#include <stdlib.h>
#include <signal.h>
#if HAVE_SYS_SIGNAL_H
# include <sys/signal.h>
#endif

#include <sys/types.h>
#include <sys/mman.h>
#if HAVE_MMAP_DEVZERO
# include <fcntl.h>
# ifndef MAP_FILE
#  define MAP_FILE 0
# endif
#endif
#ifndef PROT_NONE
# define PROT_NONE 0
#endif
#if HAVE_MMAP_ANON
# define zero_fd -1
# define map_flags MAP_ANON | MAP_PRIVATE
#elif HAVE_MMAP_ANONYMOUS
# define zero_fd -1
# define map_flags MAP_ANONYMOUS | MAP_PRIVATE
#elif HAVE_MMAP_DEVZERO
static int zero_fd;
# define map_flags MAP_FILE | MAP_PRIVATE
#endif
#if defined __NetBSD__ && (defined __sparc__ || defined __sparc64__)
/* getpagesize () is 0x1000 or 0x2000, depending on hardware. */
# include <unistd.h>
# define SIGSEGV_FAULT_ADDRESS_ROUNDOFF_BITS (getpagesize () - 1)
#elif defined __linux__ && (defined __s390__ || defined __s390x__)
# define SIGSEGV_FAULT_ADDRESS_ROUNDOFF_BITS (0x1000UL - 1)
#else
# define SIGSEGV_FAULT_ADDRESS_ROUNDOFF_BITS 0UL
#endif
unsigned long page;
int handler_called = 0;
void sigsegv_handler (int sig, siginfo_t *sip, void *ucp)
{
  void *fault_address = (void *) (sip->si_addr);
  handler_called++;
  if (handler_called == 10)
    exit (4);
  if (fault_address
      != (void*)((page + 0x678) & ~SIGSEGV_FAULT_ADDRESS_ROUNDOFF_BITS))
    exit (3);
  if (mprotect ((void *) page, 0x10000, PROT_READ | PROT_WRITE) < 0)
    exit (2);
}
void crasher (unsigned long p)
{
  *(int *) (p + 0x678) = 42;
}
int main ()
{
  void *p;
  struct sigaction action;
  /* Preparations. */
#if !HAVE_MMAP_ANON && !HAVE_MMAP_ANONYMOUS && HAVE_MMAP_DEVZERO
  zero_fd = open ("/dev/zero", O_RDONLY, 0644);
#endif
  /* Setup some mmaped memory. */
#ifdef __hpux
  /* HP-UX 10 mmap() often fails when given a hint. So give the OS complete
     freedom about the address range. */
  p = mmap ((void *) 0, 0x10000, PROT_READ | PROT_WRITE, map_flags, zero_fd, 0);
#else
  p = mmap ((void *) 0x12340000, 0x10000, PROT_READ | PROT_WRITE, map_flags, zero_fd, 0);
#endif
  if (p == (void *)(-1))
    exit (2);
  page = (unsigned long) p;
  /* Make it read-only. */
#if defined __linux__ && defined __sparc__
  /* On Linux 2.6.26/SPARC64, PROT_READ has the same effect as
     PROT_READ | PROT_WRITE. */
  if (mprotect ((void *) page, 0x10000, PROT_NONE) < 0)
#else
  if (mprotect ((void *) page, 0x10000, PROT_READ) < 0)
#endif
    exit (2);
  /* Install the SIGSEGV handler. */
  sigemptyset(&action.sa_mask);
  action.sa_sigaction = &sigsegv_handler;
  action.sa_flags = SA_SIGINFO;
  sigaction (SIGSEGV, &action, (struct sigaction *) NULL);
  sigaction (SIGBUS, &action, (struct sigaction *) NULL);
  /* The first write access should invoke the handler and then complete. */
  crasher (page);
  /* The second write access should not invoke the handler. */
  crasher (page);
  /* Check that the handler was called only once. */
  if (handler_called != 1)
    exit (1);
  /* Test passed! */
  return 0;
}

In normal mode the segfault happens here:

│0x401102 <main+130> callq 0x401030 <sigaction@plt>
│0x401107 <main+135> mov 0x2f4a(%rip),%rax # 0x404058 <page>
│0x40110e <main+142> movl $0x2a,0x678(%rax)
│0x401118 <main+152> cmpl $0x1,0x2f35(%rip)
│0x40111f <main+159> jne 0x401135 <main+181>

So the handler is called before the cmpl instruction. With LTO, however, we end up with:

│0x40110b <main+139> cmpl $0x1,0x2f46(%rip) # 0x404058 <handler_called>
│0x401112 <main+146> movl $0x2a,0x678(%rbx)
│0x40111c <main+156> jne 0x401133 <main+179>
│0x40111e <main+158> add $0xa0,%rsp

So the comparison (handler_called != 1) happens before the store that triggers the handler. In my opinion, such a transformation is correct.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c8

--- Comment #8 from Dr. Werner Fink <werner@suse.com> ---
Hmm ... the transformation might be correct(?), but how should libsigsegv catch the first segmentation fault (and only this one), if the check is moved out of the way in the resulting assembler instructions? Would the volatile attribute for handler_called help here?

Btw: IMHO we need a HOWTO for debugging link-time optimized programs within gdb, because with LTO the core dumps from users/customers become rather useless (IMHO).
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c9

--- Comment #9 from Dr. Werner Fink <werner@suse.com> ---
From an old manual page of gcc (Leap 15.0):

Link-time optimization does not work well with generation of debugging information. Combining -flto with -g is currently experimental and expected to produce unexpected results.

Now I read on Tumbleweed:

Link-time optimization does not work well with generation of debugging information on systems other than those using a combination of ELF and DWARF.

May I ask what has changed in the meantime?
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c10

--- Comment #10 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #8)
> Hmm ... the transformation might be correct(?), but how should libsigsegv catch the first segmentation fault (and only this one), if the check is moved out of the way in the resulting assembler instructions? Would the volatile attribute for handler_called help here?

I would recommend using a memory barrier:
asm volatile("" ::: "memory");

More information:
https://stackoverflow.com/questions/14950614/working-of-asm-volatile-memory

> Btw: IMHO we need a HOWTO for debugging link-time optimized programs within gdb, because with LTO the core dumps from users/customers become rather useless (IMHO).
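The barrier suggested in comment #10 can be sketched on a reduced version of the conftest. This is only an illustration, assuming Linux with GCC or Clang; the names COMPILER_BARRIER, on_fault and count_faults_with_barrier are hypothetical and do not appear in libsigsegv or in the attached patch. The empty asm statement emits no instruction, but its "memory" clobber should keep the compiler from moving the read of handler_called ahead of the faulting store.

```c
/* Sketch only: assumes Linux with GCC or Clang. All names are hypothetical. */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1
#endif
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 0x10000UL

/* Empty asm with a "memory" clobber: no instruction is emitted, but the
   compiler may not move or cache memory accesses across it. */
#define COMPILER_BARRIER() __asm__ __volatile__("" ::: "memory")

static unsigned long page;   /* start of the read-only mapping */
static int handler_called;   /* plain int on purpose, as in the conftest */

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void) sig; (void) info; (void) ctx;
    handler_called++;
    /* Make the page writable so the faulting store can be restarted. */
    if (mprotect((void *) page, REGION_SIZE, PROT_READ | PROT_WRITE) < 0)
        _exit(2);
}

/* Returns the number of handler invocations, or -1 if the setup failed. */
int count_faults_with_barrier(void)
{
    struct sigaction action, old_segv, old_bus;
    void *p = mmap(NULL, REGION_SIZE, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    page = (unsigned long) p;
    handler_called = 0;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = on_fault;
    action.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &action, &old_segv);
    sigaction(SIGBUS, &action, &old_bus);

    COMPILER_BARRIER();
    *(int *) (page + 0x678) = 42;   /* first write: faults, handler runs */
    COMPILER_BARRIER();
    *(int *) (page + 0x678) = 43;   /* second write: page is writable now */
    COMPILER_BARRIER();

    int calls = handler_called;     /* read only after both stores */

    /* Restore the previous handlers and release the mapping. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(p, REGION_SIZE);
    return calls;
}
```

Calling count_faults_with_barrier() from a small main() and comparing the result against 1 mirrors the check at the end of the conftest. Note that the barrier constrains only the compiler; it is not a CPU fence, which is not needed here because the handler runs on the same thread.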
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c11

Martin Liška <martin.liska@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rguenther@suse.com

--- Comment #11 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #9)
> From an old manual page of gcc (Leap 15.0):
>
> Link-time optimization does not work well with generation of debugging information. Combining -flto with -g is currently experimental and expected to produce unexpected results.
>
> Now I read on Tumbleweed:
>
> Link-time optimization does not work well with generation of debugging information on systems other than those using a combination of ELF and DWARF.

Correct.

> May I ask what has changed in the meantime?

In GCC 8 we made a significant improvement of the debug info:
https://gcc.gnu.org/gcc-8/changes.html

Link-time optimization improvements:
We have significantly improved debug information on ELF targets using DWARF by properly preserving language-specific information. This allows for example the libstdc++ pretty-printers to work with LTO optimized executables.

This work was done mainly by Richard Biener and is known as 'Early LTO'. Note that it was a significant effort, and program backtraces as well as the debugging experience in gdb should be comparable to non-LTO mode.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c12

--- Comment #12 from Dr. Werner Fink <werner@suse.com> ---
(In reply to Martin Liška from comment #10)
> (In reply to Dr. Werner Fink from comment #8)
> > Hmm ... the transformation might be correct(?), but how should libsigsegv catch the first segmentation fault (and only this one), if the check is moved out of the way in the resulting assembler instructions? Would the volatile attribute for handler_called help here?
>
> I would recommend using a memory barrier:
> asm volatile("" ::: "memory");
>
> More information:
> https://stackoverflow.com/questions/14950614/working-of-asm-volatile-memory
>
> > Btw: IMHO we need a HOWTO for debugging link-time optimized programs within gdb, because with LTO the core dumps from users/customers become rather useless (IMHO).

Ouch ... this is somehow a déjà vu for me, as in the past (15 years back) I used memory barriers very often but was told that this is not needed anymore with modern gcc.

IMHO it is a bug of the compiler if code is moved in such a way that the logic becomes broken.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c13

--- Comment #13 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #12)
> (In reply to Martin Liška from comment #10)
> > (In reply to Dr. Werner Fink from comment #8)
> > > Hmm ... the transformation might be correct(?), but how should libsigsegv catch the first segmentation fault (and only this one), if the check is moved out of the way in the resulting assembler instructions? Would the volatile attribute for handler_called help here?
> >
> > I would recommend using a memory barrier:
> > asm volatile("" ::: "memory");
> >
> > More information:
> > https://stackoverflow.com/questions/14950614/working-of-asm-volatile-memory
> >
> > > Btw: IMHO we need a HOWTO for debugging link-time optimized programs within gdb, because with LTO the core dumps from users/customers become rather useless (IMHO).
>
> Ouch ... this is somehow a déjà vu for me, as in the past (15 years back) I used memory barriers very often but was told that this is not needed anymore with modern gcc.
>
> IMHO it is a bug of the compiler if code is moved in such a way that the logic becomes broken.

So I created a GCC issue; let's wait for a clarification there:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=90245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c14

--- Comment #14 from Dr. Werner Fink <werner@suse.com> ---
Created attachment 803582
  --> http://bugzilla.opensuse.org/attachment.cgi?id=803582&action=edit
libsigsegv-2.12.dif

This one seems to work around the moved assembler code as well as the error caused by the redefinition of the structures in <bits/sigcontext.h> by <asm/sigcontext.h> ... besides the redefinition, the move of the logic in the resulting assembler code is somewhat concerning, at least for me ;)
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c15

--- Comment #15 from Martin Liška <martin.liska@suse.com> ---
With the configure patch in place, would you like to enable LTO for the package?
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c17

--- Comment #17 from Dr. Werner Fink <werner@suse.com> ---
(In reply to Martin Liška from comment #15)
> With the configure patch in place, would you like to enable LTO for the package?

I tried it and it builds ... but does it work in real life? In my experience no one can trust a compiler which (re)moves logic from code.

In fact, in the past I used memory barriers in the body of for loops where the code was only in the boolean and increment statements, to keep those loops from being removed.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c18

--- Comment #18 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #17)
> (In reply to Martin Liška from comment #15)
> > With the configure patch in place, would you like to enable LTO for the package?
>
> I tried it and it builds ... but does it work in real life? In my experience no one can trust a compiler which (re)moves logic from code.

We call that an optimizing compiler :)

> In fact, in the past I used memory barriers in the body of for loops where the code was only in the boolean and increment statements, to keep those loops from being removed.

Well, you should probably use volatile for variables which can be influenced by a handler. With that, the memory barriers should not be needed.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c20

--- Comment #20 from Dr. Werner Fink <werner@suse.com> ---
(In reply to Martin Liška from comment #18)
> > In fact, in the past I used memory barriers in the body of for loops where the code was only in the boolean and increment statements, to keep those loops from being removed.
>
> Well, you should probably use volatile for variables which can be influenced by a handler. With that, the memory barriers should not be needed.

In fact, using `volatile sig_atomic_t' also works. It is an interesting question why the type sig_atomic_t is not defined with the volatile qualifier in <bits/types.h>.
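The `volatile sig_atomic_t' variant mentioned in comment #20 might look roughly like the following reduced test. It is a sketch under assumptions, not the content of the attached libsigsegv-2.12.dif: Linux with GCC or Clang is assumed, and on_fault and count_faults_with_volatile are hypothetical names. One detail goes beyond what the thread states: the faulting writes are also made through a volatile pointer, because C only orders volatile accesses relative to other volatile accesses.

```c
/* Sketch only: assumes Linux with GCC or Clang. All names are hypothetical. */
#ifndef _DEFAULT_SOURCE
# define _DEFAULT_SOURCE 1
#endif
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define REGION_SIZE 0x10000UL

static unsigned long page;                     /* start of the mapping */
static volatile sig_atomic_t handler_called;   /* modified by the handler */

static void on_fault(int sig, siginfo_t *info, void *ctx)
{
    (void) sig; (void) info; (void) ctx;
    handler_called++;
    /* Make the page writable so the faulting store can be restarted. */
    if (mprotect((void *) page, REGION_SIZE, PROT_READ | PROT_WRITE) < 0)
        _exit(2);
}

/* Returns the number of handler invocations, or -1 if the setup failed. */
int count_faults_with_volatile(void)
{
    struct sigaction action, old_segv, old_bus;
    void *p = mmap(NULL, REGION_SIZE, PROT_READ,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;
    page = (unsigned long) p;
    handler_called = 0;

    memset(&action, 0, sizeof action);
    sigemptyset(&action.sa_mask);
    action.sa_sigaction = on_fault;
    action.sa_flags = SA_SIGINFO;
    sigaction(SIGSEGV, &action, &old_segv);
    sigaction(SIGBUS, &action, &old_bus);

    /* Volatile stores: an assumption of this sketch, so that the stores and
       the volatile read below stay in program order. */
    *(volatile int *) (page + 0x678) = 42;   /* faults, handler runs */
    *(volatile int *) (page + 0x678) = 43;   /* page is writable now */

    int calls = (int) handler_called;        /* volatile read after stores */

    /* Restore the previous handlers and release the mapping. */
    sigaction(SIGSEGV, &old_segv, NULL);
    sigaction(SIGBUS, &old_bus, NULL);
    munmap(p, REGION_SIZE);
    return calls;
}
```

As in the conftest, the expected result is exactly one handler invocation. Compared with the barrier approach, the qualifier documents at the declaration that the variable is shared with a signal handler, at the cost of forcing every access to it to go to memory.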
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c21

--- Comment #21 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #20)
> (In reply to Martin Liška from comment #18)
> > > In fact, in the past I used memory barriers in the body of for loops where the code was only in the boolean and increment statements, to keep those loops from being removed.
> >
> > Well, you should probably use volatile for variables which can be influenced by a handler. With that, the memory barriers should not be needed.
>
> In fact, using `volatile sig_atomic_t' also works. It is an interesting question why the type sig_atomic_t is not defined with the volatile qualifier in <bits/types.h>.

That's a very good question, I asked the same! Can you please send a question to the https://sourceware.org/ml/libc-help/ mailing list?
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c22

--- Comment #22 from Dr. Werner Fink <werner@suse.com> ---
RPM lint does not like LTO in static libs:

[ 11s] RPMLINT report:
[ 11s] ===============
[ 12s] libsigsegv-devel.x86_64: W: lto-bytecode /usr/lib64/libsigsegv.a
[ 12s] This executable contains a LTO section. LTO bytecode is not portable and
[ 12s] should not be distributed in static libraries or e.g. Python modules.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c23

--- Comment #23 from Martin Liška <martin.liska@suse.com> ---
(In reply to Dr. Werner Fink from comment #22)
> RPM lint does not like LTO in static libs:
>
> [ 11s] RPMLINT report:
> [ 11s] ===============
> [ 12s] libsigsegv-devel.x86_64: W: lto-bytecode /usr/lib64/libsigsegv.a
> [ 12s] This executable contains a LTO section. LTO bytecode is not portable and
> [ 12s] should not be distributed in static libraries or e.g. Python modules.

Yep, I've added the check. Using fat LTO objects will handle this:
https://en.opensuse.org/openSUSE:LTO#Static_libraries

and brp-checks will do LTO bytecode stripping for the static libs.
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245
http://bugzilla.opensuse.org/show_bug.cgi?id=1133245#c33

Dr. Werner Fink <werner@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |FIXED

--- Comment #33 from Dr. Werner Fink <werner@suse.com> ---
FIXED