[Bug 1191345] New: opendkim segv in libunbound8
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345

Bug ID: 1191345
Summary: opendkim segv in libunbound8
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: screening-team-bugs@suse.de
Reporter: patrick.schaaf@yalwa.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

I run two load-balanced postfix MX servers, both kept reasonably up to date with tumbleweed. Part of that install, on both of them, is opendkim for mail signing. The setup has been running essentially flawlessly for a few years.

One system was updated about 40 days ago, to tumbleweed 20210810, which has libunbound8-1.13.1-2.1.x86_64.

Today I updated the other one to an even newer tumbleweed, 20210929, which brought a newer libunbound8-1.13.2-1.2.x86_64 package.

And there, I experienced opendkim hitting a segv after ~1500 seconds of uptime. I restarted it manually, and it ran into a second one at ~3600 seconds of system uptime.

The dmesg output, see below, pointed at libunbound8. So I manually force-downgraded just that package to the version that has been running on the partner machine for some days - and the segvs are gone now after several hours.

Unfortunately, these are busy mailservers with multiple mails flowing through each second, so I cannot pinpoint or reproduce the segv precisely, and I cannot risk much experimenting on them. I hope these observations can help resolve the issue anyway.

So here's the dmesg output I saw:

2021-10-05T14:02:56.385231+02:00 phobos kernel: [ 1521.425408] opendkim[2552]: segfault at 10 ip 00007fd1f33256aa sp 00007fd1eb7fd640 error 4 in libunbound.so.8.1.13[7fd1f328c000+cb000]
2021-10-05T14:02:56.385242+02:00 phobos kernel: [ 1521.425422] Code: fc 55 48 89 f5 53 48 83 ec 28 48 8b 9e 20 01 00 00 48 8b 43 10 4c 8b 38 4c 8b 50 08 48 8b 86 30 01 00 00 48 8b 80 b8 00 00 00 <4c> 8b 58 10 48 c7 86 30 01 00 00 00 00 00 00 83 fa fe 0f 84 46 02
...
2021-10-05T14:38:48.875262+02:00 phobos kernel: [ 3674.051115] opendkim[11676]: segfault at 10 ip 00007f96636696aa sp 00007f965b7fd640 error 4 in libunbound.so.8.1.13[7f96635d0000+cb000]
2021-10-05T14:38:48.875273+02:00 phobos kernel: [ 3674.051128] Code: fc 55 48 89 f5 53 48 83 ec 28 48 8b 9e 20 01 00 00 48 8b 43 10 4c 8b 38 4c 8b 50 08 48 8b 86 30 01 00 00 48 8b 80 b8 00 00 00 <4c> 8b 58 10 48 c7 86 30 01 00 00 00 00 00 00 83 fa fe 0f 84 46 02

--
You are receiving this mail because:
You are on the CC list for the bug.
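The two segfault lines above can be compared by subtracting the library's load address (the first number in the trailing brackets) from the instruction pointer. The following is only a minimal sketch of that arithmetic; the names `SEGFAULT_RE`, `fault_offset` and `LINES` are made up for illustration and are not part of any tool mentioned in this report. It assumes the kernel's usual x86 message layout as seen in the dmesg output.

```python
import re

# Pattern for the kernel's user-space segfault report as it appears in the
# dmesg output quoted in this bug (layout assumed from those two lines).
SEGFAULT_RE = re.compile(
    r"(?P<comm>\S+)\[(?P<pid>\d+)\]: segfault at (?P<addr>[0-9a-f]+) "
    r"ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) error (?P<err>\d+) "
    r"in (?P<lib>\S+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def fault_offset(line):
    """Return (library name, offset of the faulting instruction in its mapping)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    size = int(m["size"], 16)
    offset = ip - base
    if not 0 <= offset < size:
        raise ValueError("instruction pointer lies outside the reported mapping")
    return m["lib"], offset

# The two segfault lines from the report (timestamps and hostname omitted).
LINES = [
    "opendkim[2552]: segfault at 10 ip 00007fd1f33256aa sp 00007fd1eb7fd640 "
    "error 4 in libunbound.so.8.1.13[7fd1f328c000+cb000]",
    "opendkim[11676]: segfault at 10 ip 00007f96636696aa sp 00007f965b7fd640 "
    "error 4 in libunbound.so.8.1.13[7f96635d0000+cb000]",
]

for line in LINES:
    lib, offset = fault_offset(line)
    print(lib, hex(offset))
# prints "libunbound.so.8.1.13 0x996aa" twice
```

Both crashes land at the same offset inside the mapping, which is consistent with the identical Code: bytes in the two reports and suggests the same instruction faulted each time. The offset is relative to the start of the mapped region reported by the kernel, so it may need adjusting before it can be matched against symbols in the library file.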
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c1
Togan Muftuoglu
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c2
Ferdinand Thiessen
Michael Ströder
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c3
--- Comment #3 from Togan Muftuoglu
For me it looks like an issue within the new version of the unbound library
As a workaround I have defined the local unbound as nameserver, and it seems to hold so far:

Nameservers 127.0.0.1
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c6
--- Comment #6 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c7
--- Comment #7 from Ferdinand Thiessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c8
--- Comment #8 from Ferdinand Thiessen
> Today I got around to testing for this bug / crash another time.
> For both tw VERSION_ID 20211120 (libunbound8-1.13.2-2.1) and VERSION_ID 20211220 (libunbound8-1.14.0-1.1), I could see the segv.

Can you try my patched version? Does the crash still occur?
https://download.opensuse.org/repositories/home:/susnux:/branches:/server:/d...
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c9
Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c10
--- Comment #10 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c11
--- Comment #11 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c12
--- Comment #12 from Ferdinand Thiessen
> Would be nice if this could be fixed for real.

Waiting for a maintainer to accept the SR: https://build.opensuse.org/request/show/948954
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c13
--- Comment #13 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c14
--- Comment #14 from Ferdinand Thiessen
> Another test today with current tw libunbound8-1.14.0-1.3.x86_64.rpm - same coredumping as seen before; switched again to Ferdinand's test rpm, which continues to work.

Still no response from the package maintainer. I have tried to contact the project maintainers about the submit request.
Aaron Puchert
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c15
--- Comment #15 from Patrick Schaaf
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c16
Ferdinand Thiessen
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c17
--- Comment #17 from Togan Muftuoglu
> Tested this here once more at current tw snapshot 20220516, with libunbound8-1.15.0-1.2 package - and after 6 hours of running, I'm happy to see no coredumps.
> So, this ticket is probably good to be closed.

In my experience with this bug, it is always triggered when a bad DKIM signature arrives. With the previous versions I had days with no issues, and then it would suddenly crash. So in my opinion 6 hrs is a bit enthusiastic, but YMMV. Rather than closing the bug now, I would wait something like 72 hrs before proceeding. Of course reopening the bug or creating a new one is always possible, but then one could ask why this one was closed in a rush.
http://bugzilla.opensuse.org/show_bug.cgi?id=1191345#c18
--- Comment #18 from Ferdinand Thiessen
> So in my opinion 6 hrs is a bit enthusiastic approach but YMMV. Rather than opting to close the bug I would rather wait something like 72hrs than proceed.

The origin of the bug was identified in libunbound and fixed with the latest release, so this bug can be marked as fixed (see submit request to factory above).