[Bug 1206321] New: livepatch test klp_tc_8.sh fails: rmmod: ERROR: Module klp_tc_8_4_livepatch is in use
https://bugzilla.suse.com/show_bug.cgi?id=1206321

Bug ID: 1206321
Summary: livepatch test klp_tc_8.sh fails: rmmod: ERROR: Module klp_tc_8_4_livepatch is in use
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: petr.vorel@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

This looks like a race (it does not always fail) [1]. It has been present at least since build 20221123 [2], which has kernel 6.0.8-1.1 (1579d93) (likely earlier, but older failed logs have been cleared; it might affect only the 6.0.x stable branch). The ppc64le version looks OK [3]. I was not able to reproduce it on my laptop (6.1.0-rc8-2.g2fb1790-default), nor in my openQA instance [4] (it might be related to the o3 setup/infrastructure).

Logs from the openQA run (./klp_tc_8.sh) are in [5]; there is nothing obvious in dmesg [6].

[19:26:53] Test Case 8: Patch with replace-all
[19:26:53] *** Compiling live patches
[19:26:53] make: Entering directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:54] CC [M] /tmp/live-patch/tc_8/patch_replace-all_1/klp_tc_8_1_livepatch.o
[19:26:55] MODPOST /tmp/live-patch/tc_8/patch_replace-all_1/Module.symvers
[19:26:55] CC [M] /tmp/live-patch/tc_8/patch_replace-all_1/klp_tc_8_1_livepatch.mod.o
[19:26:55] LD [M] /tmp/live-patch/tc_8/patch_replace-all_1/klp_tc_8_1_livepatch.ko
[19:26:55] BTF [M] /tmp/live-patch/tc_8/patch_replace-all_1/klp_tc_8_1_livepatch.ko
[19:26:55] Skipping BTF generation for /tmp/live-patch/tc_8/patch_replace-all_1/klp_tc_8_1_livepatch.ko due to unavailability of vmlinux
[19:26:55] make: Leaving directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:55] make: Entering directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:56] CC [M] /tmp/live-patch/tc_8/patch_replace-all_2/klp_tc_8_2_livepatch.o
[19:26:56] MODPOST /tmp/live-patch/tc_8/patch_replace-all_2/Module.symvers
[19:26:56] CC [M] /tmp/live-patch/tc_8/patch_replace-all_2/klp_tc_8_2_livepatch.mod.o
[19:26:57] LD [M] /tmp/live-patch/tc_8/patch_replace-all_2/klp_tc_8_2_livepatch.ko
[19:26:57] BTF [M] /tmp/live-patch/tc_8/patch_replace-all_2/klp_tc_8_2_livepatch.ko
[19:26:57] Skipping BTF generation for /tmp/live-patch/tc_8/patch_replace-all_2/klp_tc_8_2_livepatch.ko due to unavailability of vmlinux
[19:26:57] make: Leaving directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:57] make: Entering directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:58] CC [M] /tmp/live-patch/tc_8/patch_replace-all_3/klp_tc_8_3_livepatch.o
[19:26:58] MODPOST /tmp/live-patch/tc_8/patch_replace-all_3/Module.symvers
[19:26:58] CC [M] /tmp/live-patch/tc_8/patch_replace-all_3/klp_tc_8_3_livepatch.mod.o
[19:26:58] LD [M] /tmp/live-patch/tc_8/patch_replace-all_3/klp_tc_8_3_livepatch.ko
[19:26:58] BTF [M] /tmp/live-patch/tc_8/patch_replace-all_3/klp_tc_8_3_livepatch.ko
[19:26:58] Skipping BTF generation for /tmp/live-patch/tc_8/patch_replace-all_3/klp_tc_8_3_livepatch.ko due to unavailability of vmlinux
[19:26:58] make: Leaving directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:58] make: Entering directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:26:59] CC [M] /tmp/live-patch/tc_8/patch_replace-all_4/klp_tc_8_4_livepatch.o
[19:27:00] MODPOST /tmp/live-patch/tc_8/patch_replace-all_4/Module.symvers
[19:27:00] CC [M] /tmp/live-patch/tc_8/patch_replace-all_4/klp_tc_8_4_livepatch.mod.o
[19:27:00] LD [M] /tmp/live-patch/tc_8/patch_replace-all_4/klp_tc_8_4_livepatch.ko
[19:27:00] BTF [M] /tmp/live-patch/tc_8/patch_replace-all_4/klp_tc_8_4_livepatch.ko
[19:27:00] Skipping BTF generation for /tmp/live-patch/tc_8/patch_replace-all_4/klp_tc_8_4_livepatch.ko due to unavailability of vmlinux
[19:27:00] make: Leaving directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:27:00] make: Entering directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:27:01] CC [M] /tmp/live-patch/tc_8/patch_replace-all_5/klp_tc_8_5_livepatch.o
[19:27:01] MODPOST /tmp/live-patch/tc_8/patch_replace-all_5/Module.symvers
[19:27:01] CC [M] /tmp/live-patch/tc_8/patch_replace-all_5/klp_tc_8_5_livepatch.mod.o
[19:27:01] LD [M] /tmp/live-patch/tc_8/patch_replace-all_5/klp_tc_8_5_livepatch.ko
[19:27:01] BTF [M] /tmp/live-patch/tc_8/patch_replace-all_5/klp_tc_8_5_livepatch.ko
[19:27:01] Skipping BTF generation for /tmp/live-patch/tc_8/patch_replace-all_5/klp_tc_8_5_livepatch.ko due to unavailability of vmlinux
[19:27:01] make: Leaving directory '/usr/src/linux-6.0.12-1-obj/x86_64/default'
[19:27:01] *** Inserting getpid patch 1
[19:27:02] *** Wait for completion (klp_tc_8_1_livepatch)
[19:27:02] *** Inserting getpid patch 2
[19:27:02] *** Wait for completion (klp_tc_8_2_livepatch)
[19:27:02] *** Inserting getpid patch 3
[19:27:02] *** Wait for completion (klp_tc_8_3_livepatch)
[19:27:02] *** Inserting getpid patch 4
[19:27:02] *** Wait for completion (klp_tc_8_4_livepatch)
[19:27:02] *** Inserting getpid patch 5
[19:27:02] *** Wait for completion (klp_tc_8_5_livepatch)
[19:27:02] *** Removing getpid patch 1
[19:27:02] *** Removing getpid patch 2
[19:27:02] *** Removing getpid patch 3
[19:27:02] *** Removing getpid patch 4
[19:27:02] rmmod: ERROR: Module klp_tc_8_4_livepatch is in use
[19:27:02] *** Removing patches
[19:27:02] TEST FAILED while executing 'rmmod "$PATCH_MOD_NAME"'

[1] https://openqa.opensuse.org/tests/2947725#next_previous
[2] https://openqa.opensuse.org/tests/2899233
[3] https://openqa.opensuse.org/tests/2948060#next_previous
[4] http://quasar.suse.cz/tests/1446#next_previous
[5] https://openqa.opensuse.org/tests/2947725/file/serial_terminal.txt
[6] https://openqa.opensuse.org/tests/2947725/file/serial0.txt

--
You are receiving this mail because:
You are the assignee for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1206321 https://bugzilla.suse.com/show_bug.cgi?id=1206321#c1 --- Comment #1 from Petr Vorel <petr.vorel@suse.com> --- Created attachment 863462 --> https://bugzilla.suse.com/attachment.cgi?id=863462&action=edit Test output (https://openqa.opensuse.org/tests/2947725/file/serial_terminal.txt)
https://bugzilla.suse.com/show_bug.cgi?id=1206321 https://bugzilla.suse.com/show_bug.cgi?id=1206321#c2 --- Comment #2 from Petr Vorel <petr.vorel@suse.com> --- Created attachment 863463 --> https://bugzilla.suse.com/attachment.cgi?id=863463&action=edit dmesg (https://openqa.opensuse.org/tests/2947725/file/serial0.txt)
https://bugzilla.suse.com/show_bug.cgi?id=1206321

Petr Vorel <petr.vorel@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Found By|---                         |openQA
https://bugzilla.suse.com/show_bug.cgi?id=1206321
https://bugzilla.suse.com/show_bug.cgi?id=1206321#c3

Miroslav Beneš <mbenes@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |mpdesouza@suse.com,
                   |                            |pmladek@suse.com

--- Comment #3 from Miroslav Beneš <mbenes@suse.com> ---
Is it easily reproducible in openQA then? If yes, would it be possible to enable a dynamic debug output?

# echo "file kernel/livepatch/* +p" > /sys/kernel/debug/dynamic_debug/control

It might provide more insight. The logs do not contain anything interesting now.

Given that klp_tc_8_4_livepatch is not removed, klp_tc_8_5_livepatch stays applied and is then applied to all the remaining tests (if relevant).
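
[Editorial sketch, not part of the original comment.] The dynamic debug request above could be wrapped in a small helper so that a test run can switch the livepatch debug messages on and off again. The function names and the DDEBUG_CONTROL override are hypothetical and exist only for illustration; the control file path and the "+p" query are the ones quoted in the comment, and "-p" is the standard dynamic debug syntax for clearing the flag. This assumes debugfs is mounted and the caller is root.

```shell
#!/bin/sh
# Hypothetical override so the helper can be pointed at another file;
# the default is the control file named in the comment above.
DDEBUG_CONTROL="${DDEBUG_CONTROL:-/sys/kernel/debug/dynamic_debug/control}"

# Enable pr_debug() output for everything under kernel/livepatch/.
klp_enable_ddebug() {
    if [ ! -w "$DDEBUG_CONTROL" ]; then
        echo "cannot write $DDEBUG_CONTROL (debugfs mounted? running as root?)" >&2
        return 1
    fi
    # The query is quoted, so the shell does not expand the '*'.
    echo "file kernel/livepatch/* +p" > "$DDEBUG_CONTROL"
}

# Switch the same messages off again once the run is finished.
klp_disable_ddebug() {
    if [ ! -w "$DDEBUG_CONTROL" ]; then
        return 1
    fi
    echo "file kernel/livepatch/* -p" > "$DDEBUG_CONTROL"
}
```

Calling klp_enable_ddebug before the test and klp_disable_ddebug afterwards would keep the extra messages limited to the run under investigation.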
https://bugzilla.suse.com/show_bug.cgi?id=1206321
https://bugzilla.suse.com/show_bug.cgi?id=1206321#c4

--- Comment #4 from Petr Vorel <petr.vorel@suse.com> ---
(In reply to Miroslav Beneš from comment #3)
> Is it easily reproducible in openQA then? If yes, would it be possible to enable a dynamic debug output?
> # echo "file kernel/livepatch/* +p" > /sys/kernel/debug/dynamic_debug/control
> It might provide more insight.

Sure, I added the debugging to my fork. I reproduced it only once [1] in 6 runs [2]. Dmesg [3] contains more debug info at the end, but it also caused test 14 to fail (only in this run, where test 8 also fails).

[1] https://openqa.opensuse.org/tests/2952626
[2] https://openqa.opensuse.org/tests/overview?build=livepatch-debugging&version=Tumbleweed&distri=opensuse
[3] https://openqa.opensuse.org/tests/2952626/file/serial0.txt
[4] https://openqa.opensuse.org/tests/2952626/file/serial_terminal.txt
https://bugzilla.suse.com/show_bug.cgi?id=1206321 https://bugzilla.suse.com/show_bug.cgi?id=1206321#c5 --- Comment #5 from Petr Vorel <petr.vorel@suse.com> --- Created attachment 863504 --> https://bugzilla.suse.com/attachment.cgi?id=863504&action=edit dmesg with debugging (https://openqa.opensuse.org/tests/2952626/file/serial0.txt)
https://bugzilla.suse.com/show_bug.cgi?id=1206321 https://bugzilla.suse.com/show_bug.cgi?id=1206321#c6 --- Comment #6 from Petr Vorel <petr.vorel@suse.com> --- Created attachment 863505 --> https://bugzilla.suse.com/attachment.cgi?id=863505&action=edit Test output with debugging (https://openqa.opensuse.org/tests/2952626/file/serial_terminal.txt)
https://bugzilla.suse.com/show_bug.cgi?id=1206321
https://bugzilla.suse.com/show_bug.cgi?id=1206321#c7

--- Comment #7 from Miroslav Beneš <mbenes@suse.com> ---
There is something strange going on. The klp_tc_8_{1,2,3}_livepatch modules are just rmmoded. Their refcount is 0, so it is straightforward.

klp_tc_8_4_livepatch should be the same. They are all replace_all patches, so once klp_tc_8_5_livepatch is installed (and it is, successfully), the refcount of klp_tc_8_4_livepatch should be 0 and it should be possible to just rmmod it. But for some reason it clearly is not.
https://bugzilla.suse.com/show_bug.cgi?id=1206321
https://bugzilla.suse.com/show_bug.cgi?id=1206321#c8

--- Comment #8 from Miroslav Beneš <mbenes@suse.com> ---
I am not able to reproduce on v6.1, but...

There is a race condition. The kernel infrastructure clears klp_transition_patch once a transition succeeds. Only after that does it call module_put() on all replaced previous live patches if klp_transition_patch was replace_all.

klp_in_progress() in qa_test_klp checks /sys/kernel/livepatch/*/transition, which is basically an export of klp_transition_patch. klp_wait_complete() then waits for klp_in_progress() to return false when a live patch module is loaded, and that is it.

So what very likely happens in klp_tc_8.sh is that the klp_tc_8_*_livepatch modules are loaded sequentially. When it is klp_tc_8_5_livepatch's turn, the module is loaded successfully and klp_transition_patch is cleared. klp_wait_complete() sees it, and the test then tries to rmmod all previous modules because their refcnt should be zero. But that does not have to be true for klp_tc_8_4_livepatch, because its module_put() has not been called yet in the kernel.

Let's just call klp_wait_complete() also before rmmod in klp_tc_8.sh. It has the logic to wait for refcnt being zero anyway (used in klp_tc_exit() for example, which does the right thing).

Petr, could you test with the attached patch for qa_test_klp, please?
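
[Editorial sketch, not part of the original comment.] The attached patch is not reproduced in this thread. The following only illustrates the idea described above: before calling rmmod, wait until no transition is in progress and the module's refcnt has actually dropped to zero, instead of relying on the transition flag alone. All function names, SYSFS_ROOT and POLL_INTERVAL are hypothetical and do not come from qa_test_klp; the sysfs files used are /sys/kernel/livepatch/*/transition (named in the comment) and /sys/module/<name>/refcnt.

```shell
#!/bin/sh
# Hypothetical knobs: SYSFS_ROOT lets the helpers run against a fake tree,
# POLL_INTERVAL is the delay in seconds between two polls.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
POLL_INTERVAL="${POLL_INTERVAL:-1}"

# Return 0 if any live patch still reports a transition in progress.
transition_in_progress() {
    for f in "$SYSFS_ROOT"/kernel/livepatch/*/transition; do
        [ -e "$f" ] || continue
        if [ "$(cat "$f")" = "1" ]; then
            return 0
        fi
    done
    return 1
}

# Poll until module $1 has refcnt 0 and no transition is running.
# $2 is the maximum number of polls (default 30). Returns 1 on timeout.
wait_refcnt_zero() {
    mod=$1
    tries=${2:-30}
    refcnt_file="$SYSFS_ROOT/module/$mod/refcnt"
    while [ "$tries" -gt 0 ]; do
        # Module already unloaded: nothing left to wait for.
        [ -e "$refcnt_file" ] || return 0
        if ! transition_in_progress && [ "$(cat "$refcnt_file")" = "0" ]; then
            return 0
        fi
        sleep "$POLL_INTERVAL"
        tries=$((tries - 1))
    done
    return 1
}

# Remove a replaced live patch module only once its reference is dropped.
safe_rmmod() {
    if ! wait_refcnt_zero "$1"; then
        echo "timeout waiting for refcnt of $1 to reach 0" >&2
        return 1
    fi
    rmmod "$1"
}
```

With such a helper, the removal step for patch 4 would look like `safe_rmmod klp_tc_8_4_livepatch`, which closes the window between klp_transition_patch being cleared and module_put() being called.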
https://bugzilla.suse.com/show_bug.cgi?id=1206321 https://bugzilla.suse.com/show_bug.cgi?id=1206321#c9 --- Comment #9 from Miroslav Beneš <mbenes@suse.com> --- Created attachment 863525 --> https://bugzilla.suse.com/attachment.cgi?id=863525&action=edit Fix for klp_tc_8.sh
https://bugzilla.suse.com/show_bug.cgi?id=1206321

Miroslav Beneš <mbenes@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Assignee|kernel-bugs@opensuse.org    |mbenes@suse.com