[Bug 1204607] New: Version 1.9.1-2.1 of irqbalance generates journal error
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607

            Bug ID: 1204607
           Summary: Version 1.9.1-2.1 of irqbalance generates journal error
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: openSUSE Tumbleweed
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Other
          Assignee: screening-team-bugs@suse.de
          Reporter: genes1122@gmail.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

The update to irqbalance-1.9.1-2.1 produced a journalctl error and a coredump at boot.

journalctl -b -p err
Oct 23 09:31:50 Mobile-PC /usr/sbin/irqbalance[916]: thermal: socket bind failed.

coredumpctl | grep irqbalance
Sat 2022-10-22 12:04:42 PDT 994 0 0 SIGABRT inaccessible /usr/sbin/irqbalance n/a
Sun 2022-10-23 09:32:32 PDT 916 0 0 SIGABRT inaccessible /usr/sbin/irqbalance n/a

This occurs with both kernel 6.0 and kernel 6.1.

Thanks,
Gene

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607

Gene Snider <genes1122@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P5 - None                   |P3 - Medium
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c1

Andreas Stieger <Andreas.Stieger@gmx.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Priority|P3 - Medium                 |P5 - None
                 CC|                            |dmueller@suse.com,
                   |                            |josef.moellers@suse.com,
                   |                            |trenn@suse.com
           Assignee|screening-team-bugs@suse.de |josef.moellers@suse.com

--- Comment #1 from Andreas Stieger <Andreas.Stieger@gmx.de> ---
SR#1029930 enabled thermald support
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c2

--- Comment #2 from Gene Snider <genes1122@gmail.com> ---
Does that mean irqbalance is broken? My understanding is that AMD CPUs don't support thermald. What is the status of this bug report?

Thanks,
Gene
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c4

--- Comment #4 from Gene Snider <genes1122@gmail.com> ---
Request 1:
Since I have the old rpm, I diffed the outputs for brevity.

Old version:
# zypper -q se -sx irqbalance.x86_64
S  | Name       | Type    | Version   | Arch   | Repository
---+------------+---------+-----------+--------+----------------------
i+ | irqbalance | package | 1.9.1-1.1 | x86_64 | (System Packages)
v  | irqbalance | package | 1.9.1-2.1 | x86_64 | Main Repository (OSS)
# irqbalance -d --foreground > irqbalance.old

New version:
# zypper -q se -sx irqbalance.x86_64
S  | Name       | Type    | Version   | Arch   | Repository
---+------------+---------+-----------+--------+----------------------
i+ | irqbalance | package | 1.9.1-2.1 | x86_64 | Main Repository (OSS)
# irqbalance -d --foreground > irqbalance.new

Differences:
# diff -s irqbalance.old irqbalance.new
2a3
> Prevent irq assignment to these thermal-banned CPUs: 00000000
94a96
> thermal: received group id (3).

The first message looks reasonable for a CPU that doesn't support thermald; I'm not sure about the second message. However, the message in the journal is confusing:

Oct 24 08:35:16 Mobile-PC /usr/sbin/irqbalance[3546]: thermal: socket bind failed.

It sounds like something went wrong, when in fact the action was normal for a non-Intel CPU.

Request 2:
Results of gdb backtrace for the irqbalance coredump:
# coredumpctl -1 debug
           PID: 943 (irqbalance)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Mon 2022-10-24 08:49:03 PDT (1h 33min ago)
  Command Line: /usr/sbin/irqbalance --foreground
    Executable: /usr/sbin/irqbalance
 Control Group: /system.slice/irqbalance.service
          Unit: irqbalance.service
         Slice: system.slice
       Boot ID: 1fe3c0c81e464baab607f078ceddadd1
    Machine ID: 67007840ea464a92b139e093c5491dfd
      Hostname: Mobile-PC
       Storage: /var/lib/systemd/coredump/core.irqbalance.0.1fe3c0c81e464baab607f078ceddadd1.943.1666626543000000.zst (present)
     Disk Size: 58.6K
       Message: Process 943 (irqbalance) of user 0 dumped core.

#0  0x00007f9d49c3980c in __pthread_kill_implementation () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f9d49af2780 (LWP 943))]
Missing separate debuginfos, use: zypper install irqbalance-debuginfo-1.9.1-2.1.x86_64 ...
(gdb) bt
#0  0x00007f9d49c3980c in __pthread_kill_implementation () from /lib64/libc.so.6
#1  0x00007f9d49be6846 in raise () from /lib64/libc.so.6
#2  0x00007f9d49bcf81c in abort () from /lib64/libc.so.6
#3  0x00007f9d49c2c9ae in __libc_message () from /lib64/libc.so.6
#4  0x00007f9d49c4414c in malloc_printerr () from /lib64/libc.so.6
#5  0x00007f9d49c46626 in _int_free () from /lib64/libc.so.6
#6  0x00007f9d49c48b13 in free () from /lib64/libc.so.6
#7  0x0000564f3cfa6f41 in ?? ()
#8  0x00007f9d49bd05b0 in __libc_start_call_main () from /lib64/libc.so.6
#9  0x00007f9d49bd0679 in __libc_start_main_impl () from /lib64/libc.so.6
#10 0x0000564f3cfa81a5 in ?? ()
(gdb)
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c8

--- Comment #8 from Gene Snider <genes1122@gmail.com> ---
I will install the debug package ASAP and run the backtrace again. For now, this SIGABRT occurs at boot. The CPU is a Zen 3 AMD Ryzen 7 5825U with Radeon Graphics, and sleep is broken.

Gene
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c9

--- Comment #9 from Gene Snider <genes1122@gmail.com> ---
I installed all the debug packages requested by gdb and got this result:

# coredumpctl -1 debug
           PID: 943 (irqbalance)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 6 (ABRT)
     Timestamp: Mon 2022-10-24 08:49:03 PDT (1 day 1h ago)
  Command Line: /usr/sbin/irqbalance --foreground
    Executable: /usr/sbin/irqbalance
 Control Group: /system.slice/irqbalance.service
          Unit: irqbalance.service
         Slice: system.slice
       Boot ID: 1fe3c0c81e464baab607f078ceddadd1
    Machine ID: 67007840ea464a92b139e093c5491dfd
      Hostname: Mobile-PC
       Storage: /var/lib/systemd/coredump/core.irqbalance.0.1fe3c0c81e464baab607f078ceddadd1.943.1666626543000000.zst (present)
     Disk Size: 58.6K
       Message: Process 943 (irqbalance) of user 0 dumped core.
...
Core was generated by `/usr/sbin/irqbalance --foreground'.
Program terminated with signal SIGABRT, Aborted.
warning: Section `.reg-xstate/943' in core file too small.
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
44      pthread_kill.c: No such file or directory.
[Current thread is 1 (Thread 0x7f9d49af2780 (LWP 943))]
(gdb) bt
#0  __pthread_kill_implementation (threadid=<optimized out>, signo=signo@entry=6, no_tid=no_tid@entry=0)
    at pthread_kill.c:44
#1  0x00007f9d49c39893 in __pthread_kill_internal (signo=6, threadid=<optimized out>)
    at pthread_kill.c:78
#2  0x00007f9d49be6846 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#3  0x00007f9d49bcf81c in __GI_abort () at abort.c:79
#4  0x00007f9d49c2c9ae in __libc_message (action=action@entry=do_abort, fmt=fmt@entry=0x7f9d49d5544f "%s\n")
    at ../sysdeps/posix/libc_fatal.c:155
#5  0x00007f9d49c4414c in malloc_printerr (
    str=str@entry=0x7f9d49d58088 "free(): double free detected in tcache 2") at malloc.c:5660
#6  0x00007f9d49c46626 in _int_free (av=0x7f9d49d8ec60 <main_arena>, p=0x564f3e856190, have_lock=have_lock@entry=0)
    at malloc.c:4469
#7  0x00007f9d49c48b13 in __GI___libc_free (mem=<optimized out>) at malloc.c:3385
#8  0x00007f9d49e99938 in nl_socket_free (sk=<optimized out>) at lib/socket.c:250
#9  0x0000564f3cfacea0 in deinit_thermal () at /usr/src/debug/irqbalance-1.9.1/thermal.c:510
#10 0x0000564f3cfa6f41 in main (argc=<optimized out>, argv=<optimized out>)
    at /usr/src/debug/irqbalance-1.9.1/irqbalance.c:722
(gdb)

Gene
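
The backtrace in comment #9 shows glibc aborting with "free(): double free detected in tcache 2" when deinit_thermal() calls nl_socket_free(). The thread does not say how the handle came to be freed twice. One common way for this to happen is an init routine that frees the handle on its error path (for example after a failed bind) while the shutdown routine frees the same pointer again. The C sketch below illustrates that pattern and one defensive remedy, which is to clear the pointer after freeing it. It is not the irqbalance patch: fake_sock, sock_release, thermal_init, and thermal_deinit are hypothetical names, and the bind failure is simulated by a flag rather than by a real netlink call.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a netlink socket handle; NOT the libnl type. */
struct fake_sock {
    int fd;
};

static struct fake_sock *sock; /* module-level handle, as a daemon might keep */
static int free_calls;         /* counts real releases, for demonstration only */

/* Release the handle once and clear the caller's pointer, so that a
 * second call on the same pointer becomes a harmless no-op. */
void sock_release(struct fake_sock **sp)
{
    assert(sp != NULL);
    if (*sp == NULL)
        return;     /* already released */
    free(*sp);
    *sp = NULL;     /* prevents a later double free */
    free_calls++;
}

/* Hypothetical init: returns 0 on success, -1 on failure.
 * bind_ok simulates whether the (assumed) bind step succeeds. */
int thermal_init(int bind_ok)
{
    sock = malloc(sizeof(*sock));
    if (sock == NULL)
        return -1;
    sock->fd = 3;   /* placeholder value */

    if (!bind_ok) {
        fprintf(stderr, "thermal: socket bind failed.\n");
        sock_release(&sock);   /* error-path cleanup */
        return -1;
    }
    return 0;
}

/* Hypothetical shutdown: safe even when init already cleaned up. */
void thermal_deinit(void)
{
    sock_release(&sock);
}
```

With a plain free(sock) in both places and no NULL reset, calling thermal_deinit() after a failed thermal_init(0) would free the same chunk twice, which is the kind of error glibc reports in frame #5.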
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c12

--- Comment #12 from OBSbugzilla Bot <bwiedemann+obsbugzillabot@suse.com> ---
This is an autogenerated message for OBS integration:
This bug (1204607) was mentioned in
https://build.opensuse.org/request/show/1031212 Factory / irqbalance
http://bugzilla.opensuse.org/show_bug.cgi?id=1204607#c13

--- Comment #13 from Gene Snider <genes1122@gmail.com> ---
The double free core dump appears to be fixed by irqbalance-1.9.1-244.1. Hopefully it works for Michael Hirmke as well.

Thanks!!
Gene