On Thu, 2012-05-31 at 08:50 +0200, Roger Oberholtzer wrote:
The symptom I see is that when I use one of these libraries, I then get a segmentation violation in libz. And I cannot see any problem with my call to gzwrite(). I am guessing that some part of the zlib data structure for the specific file has been corrupted by someone. The failure is always at the same point: a SIGPOLL signal results in a signal handler reading from a GPS, and the read data is sent to gzwrite(). The reads/writes are properly serialized. The signals do not interrupt themselves. When I add the thread library to the application so that there are threads reading from GigE Vision cameras, this problem can occur.
I should add this to my original question in this thread: if a process has signal handlers, how does this affect the core file? That is, if a segmentation violation occurs and then a signal arrives, could the signal handler somehow confuse what I see in gdb? In my case, the backtrace looks like this:

#0  0xb5df0c23 in ?? () from /lib/libz.so.1
#1  0xb5df0ff8 in deflate () from /lib/libz.so.1
#2  0xb5dee674 in gzwrite () from /lib/libz.so.1
#3  0x0808723d in GpsDataHandler (info=0x8329fd8) at ../gps.c:2430
#4  0xb5bbb60d in SIGPOLLhandler (sig=29) at ../aim.c:129
#5  <signal handler called>
#6  0xffffe424 in __kernel_vsyscall ()
#7  0xb5d1d6a1 in select () from /lib/libc.so.6
#8  0xb71fae66 in Tcl_WaitForEvent () from /usr/lib/libtcl8.5.so
#9  0xb71c2cdb in Tcl_DoOneEvent () from /usr/lib/libtcl8.5.so
#10 0x08065687 in TheRealMeasurementLoop () at ../measure.c:223
#11 DoMeasurement () at ../measure.c:310
#12 0x08062aea in DoHiway (interp=0x81879c8) at ../hiway.c:684
#13 0xb7025687 in Tk_MainEx () from /usr/lib/libtk8.5.so
#14 0x080609fe in main (argc=0, argv=0xbfa2af14, envp=0x817c160) at ../hiway.c:164

(I have newer traces that show which line in libz fails.)

In fact, SIGPOLL is raised for other sources as well, for example a UDP port, and that one fires much more often than the GPS one. So if the SIGPOLL frame in the trace were merely an artifact, I would have expected it to belong to the source that fires much more often. As that is never the case, I have come to the conclusion that the segmentation violation really does occur inside the signal handler. Could my assumption be wrong?

Yours sincerely,

Roger Oberholtzer

OPQ Systems / Ramböll RST