[Bug 1198668] New: Tumbleweed: Purge old kernels service block during boot

https://bugzilla.suse.com/show_bug.cgi?id=1198668

Bug ID: 1198668
Summary: Tumbleweed: Purge old kernels service block during boot
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Basesystem
Assignee: screening-team-bugs@suse.de
Reporter: petr.vorel@suse.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Sometimes the "Purge old kernels" service runs during boot rather than after it, which slows down boot:

$ LC_ALL=C journalctl -b 0 |grep -i purge
Apr 20 08:16:01 dell5510 systemd[1]: Starting Purge old kernels...
Apr 20 08:16:02 dell5510 zypper[1546]: Preparing to purge obsolete kernels...
Apr 20 08:17:09 dell5510 systemd[1]: purge-kernels.service: Deactivated successfully.
Apr 20 08:17:09 dell5510 systemd[1]: Finished Purge old kernels.
Apr 20 08:17:09 dell5510 systemd[1]: purge-kernels.service: Consumed 53.594s CPU time.

It started a few weeks ago. It does not happen on every boot, maybe once a week, but it bothered me enough to report the problem.

$ LC_ALL=C rpm -qi purge-kernels-service
Name : purge-kernels-service
Version : 0
Release : 8.4
Architecture: noarch
Install Date: Tue Feb 22 11:51:00 2022
Group : Unspecified
Size : 346
License : MIT
Signature : RSA/SHA256, Sat Feb 19 22:50:00 2022, Key ID b88b2fd43dbdc284
Source RPM : purge-kernels-service-0-8.4.src.rpm
Build Date : Sat Feb 19 22:37:41 2022
...

$ lsb_release -a
LSB Version: n/a
Distributor ID: openSUSE
Description: openSUSE Tumbleweed
Release: 20220415
Codename: n/a

--
You are receiving this mail because:
You are on the CC list for the bug.

https://bugzilla.suse.com/show_bug.cgi?id=1198668
Chenzi Cao <chcao@suse.com> changed:
What     |Removed                      |Added
----------------------------------------------------------------------------
Assignee |screening-team-bugs@suse.de  |msuchanek@suse.com

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c1
--- Comment #1 from Michal Suchanek <msuchanek@suse.com> ---
It has dependencies only on fs so it can run relatively early
  Description=Purge old kernels
  After=local-fs.target
  ConditionPathExists=/boot/do_purge_kernels
  ConditionPathIsReadWrite=/
What does 'slows down boot' mean, specifically?

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c6
Yiannis Bonatakis <ioannis.bonatakis@suse.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |ioannis.bonatakis@suse.com
--- Comment #6 from Yiannis Bonatakis <ioannis.bonatakis@suse.com> ---
It takes more than 10 minutes on my laptop, which is annoyingly long to wait for boot.
```
$ sudo systemd-analyze blame
13min 21.087s purge-kernels.service
13min 4.007s plymouth-quit-wait.service
```
and the journal reports
```
$ sudo journalctl -u purge-kernels
Apr 27 09:52:11 anais systemd[1]: Starting Purge old kernels...
Apr 27 09:52:46 anais.suse.cz zypper[1442]: Reading installed packages...
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Preparing to purge obsolete kernels...
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Configuration: latest,latest-1,running
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Running kernel release: 5.17.3-1-default
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Running kernel arch: x86_64
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Resolving package dependencies...
Apr 27 09:52:47 anais.suse.cz zypper[1442]: The following 4 packages are going to be REMOVED:
Apr 27 09:52:47 anais.suse.cz zypper[1442]: bbswitch-kmp-default-0.8_k5.16.15_1-11.60 bbswitch-kmp-default-0.8_k5.17.1_1-11.61 kernel-default-5.16.15-1.1 kernel-default-5.17.1-1.2
Apr 27 09:52:47 anais.suse.cz zypper[1442]: 4 packages to remove.
Apr 27 09:52:47 anais.suse.cz zypper[1442]: After the operation, 499.7 MiB will be freed.
Apr 27 09:52:47 anais.suse.cz zypper[1442]: Continue? [y/n/v/...? shows all options] (y): y
Apr 27 09:53:52 anais.suse.cz [RPM][2787]: Transaction ID 6268f690 started
Apr 27 09:53:59 anais.suse.cz [RPM][2787]: erase bbswitch-kmp-default-0.8_k5.16.15_1-11.60.x86_64: success
Apr 27 09:59:11 anais.suse.cz [RPM][2787]: erase bbswitch-kmp-default-0.8_k5.16.15_1-11.60.x86_64: success
Apr 27 09:59:11 anais.suse.cz [RPM][2787]: Transaction ID 6268f690 finished: 0
Apr 27 09:59:18 anais.suse.cz zypper[1442]: (1/4) Removing bbswitch-kmp-default-0.8_k5.16.15_1-11.60.x86_64 [.....done]
Apr 27 09:59:18 anais.suse.cz [RPM][11672]: Transaction ID 6268f7d6 started
Apr 27 09:59:18 anais.suse.cz [RPM][11672]: erase bbswitch-kmp-default-0.8_k5.17.1_1-11.61.x86_64: success
Apr 27 10:04:18 anais.suse.cz [RPM][11672]: erase bbswitch-kmp-default-0.8_k5.17.1_1-11.61.x86_64: success
Apr 27 10:04:18 anais.suse.cz [RPM][11672]: Transaction ID 6268f7d6 finished: 0
Apr 27 10:04:22 anais.suse.cz zypper[1442]: (2/4) Removing bbswitch-kmp-default-0.8_k5.17.1_1-11.61.x86_64 [.....done]
Apr 27 10:04:22 anais.suse.cz [RPM][20508]: Transaction ID 6268f906 started
Apr 27 10:04:25 anais.suse.cz [RPM][20508]: erase kernel-default-5.16.15-1.1.x86_64: success
Apr 27 10:04:44 anais.suse.cz [RPM][20508]: erase kernel-default-5.16.15-1.1.x86_64: success
Apr 27 10:04:44 anais.suse.cz [RPM][20508]: Transaction ID 6268f906 finished: 0
Apr 27 10:04:58 anais.suse.cz zypper[1442]: (3/4) Removing kernel-default-5.16.15-1.1.x86_64 [.....done]
Apr 27 10:04:58 anais.suse.cz [RPM][21638]: Transaction ID 6268f92a started
Apr 27 10:04:58 anais.suse.cz [RPM][21638]: erase kernel-default-5.17.1-1.2.x86_64: success
Apr 27 10:05:10 anais.suse.cz macosx-prober[22483]: debug: /dev/sda2 is not an HFS+ partition: exiting
Apr 27 10:05:13 anais.suse.cz [RPM][21638]: erase kernel-default-5.17.1-1.2.x86_64: success
Apr 27 10:05:13 anais.suse.cz [RPM][21638]: Transaction ID 6268f92a finished: 0
Apr 27 10:05:21 anais.suse.cz zypper[1442]: (4/4) Removing kernel-default-5.17.1-1.2.x86_64 [.....done]
Apr 27 10:05:30 anais.suse.cz zypper[1442]: There are running programs which still use files and libraries deleted or updated by recent upgrades. They should be restarted to benefit from the latest updates. Run 'zypper ps -s' to list th>
Apr 27 10:05:30 anais.suse.cz zypper[1442]:
Apr 27 10:05:30 anais.suse.cz systemd[1]: purge-kernels.service: Deactivated successfully.
Apr 27 10:05:30 anais.suse.cz systemd[1]: Finished Purge old kernels.
Apr 27 10:05:30 anais.suse.cz systemd[1]: purge-kernels.service: Consumed 2min 12.374s CPU time.
```

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c7
--- Comment #7 from Petr Vorel <petr.vorel@suse.com> ---
(In reply to Michal Suchanek from comment #4)
> Why is it waiting?
I'll try to debug whether the start of purge-kernels.service somehow depends on the delay while waiting for user input to decrypt /home. Obviously it would be great if it depended on the GUI having already started (if a GUI is enabled).

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c8
--- Comment #8 from Michal Suchanek <msuchanek@suse.com> ---
It reports that purge-kernels and some 'waiting for boot to finish' job are running. So that 'waiting for boot to finish' job probably waits for purge-kernels, but I have no idea how that dependency is created.

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c9
--- Comment #9 from Michal Suchanek <msuchanek@suse.com> ---
the other running service is likely /usr/lib/systemd/system/plymouth-quit-wait.service

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c10
Michal Suchanek <msuchanek@suse.com> changed:
What  |Removed |Added
----------------------------------------------------------------------------
CC    |        |systemd-maintainers@suse.de
Flags |        |needinfo?(systemd-maintainers@suse.de)
--- Comment #10 from Michal Suchanek <msuchanek@suse.com> ---
Please advise how to diagnose why the desktop cannot be displayed while purge-kernels.service is running.

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c11
Michal Koutný <mkoutny@suse.com> changed:
What |Removed |Added
----------------------------------------------------------------------------
CC   |        |mkoutny@suse.com
--- Comment #11 from Michal Koutný <mkoutny@suse.com> ---
A quick check would be
  systemctl show -p Before purge-kernels.service
(possibly traversing further until you reach systemd-user-sessions.service or default.target (graphical.target)).
Alternatively, you may find the ordering path(s) in a plot
  systemd-analyze dot --order | dot -Tsvg >systemd.svg
(warning: kind of firehose drinking)
Thirdly, I can see that purge-kernels.service is of Type=oneshot, which means it will block subsequent jobs until the ExecStart= command finishes. I'm not sure there is actually a need to wait until the kernel removal is done; Type=simple or Type=exec would only wait for the command to start, and the removal could run in parallel (even with jobs that specify After=purge-kernels.service). Additionally, there may be an implicit dependency outside systemd (such as the zypper lock) that prevents boot progression.
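[Editorial note] The "traverse further" step from comment #11 can be sketched as a small shell function that follows Before= edges breadth-first. This is only an illustration and was not posted in the bug: `units_before` and `reaches` are hypothetical helper names, and the sketch assumes a systemctl that supports `--value` and unit names without whitespace or glob characters.

```shell
#!/bin/sh
# Sketch (not from the bug report): follow Before= edges breadth-first from
# a start unit and report whether a target unit is reachable, i.e. whether
# the target is ordered, directly or transitively, after the start unit.
# units_before and reaches are hypothetical helper names.

# Print the units listed in the Before= property of unit $1.
units_before() {
    systemctl show -p Before --value "$1"
}

# reaches START TARGET
# Prints every unit it visits; returns 0 once TARGET is reached, else 1.
# Plain sh has no local variables, so the names below are global.
reaches() {
    start=$1
    target=$2
    queue=$start          # space-separated list of units still to visit
    seen=" $start "       # space-delimited set of units already queued
    while [ -n "$queue" ]; do
        unit=${queue%% *}                 # take the first queued unit
        case $queue in
            *" "*) queue=${queue#* } ;;   # drop it from the queue
            *)     queue= ;;
        esac
        printf '%s\n' "$unit"
        [ "$unit" = "$target" ] && return 0
        for next in $(units_before "$unit"); do
            case $seen in
                *" $next "*) ;;           # already queued, skip
                *) seen="$seen$next "
                   queue="${queue:+$queue }$next" ;;
            esac
        done
    done
    return 1
}
```

For example, `reaches purge-kernels.service graphical.target >/dev/null && echo ordered` would indicate that graphical.target is ordered somewhere after the purge service. It does not show which path is responsible; the `systemd-analyze dot` plot is still needed for that.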

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c12
--- Comment #12 from Michal Suchanek <msuchanek@suse.com> ---
So let's start with the simple part: https://build.opensuse.org/request/show/977061

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c13
Franck Bui <fbui@suse.com> changed:
What  |Removed                                |Added
----------------------------------------------------------------------------
CC    |                                       |fbui@suse.com
Flags |needinfo?(systemd-maintainers@suse.de) |
--- Comment #13 from Franck Bui <fbui@suse.com> ---
I don't think Type=exec is the correct fix here. If there's anything wrong with purge-kernels.service ordering constraints then they should be fixed instead of tweaking the type of the service itself.
You can start by inspecting the ordering constraints of purge-kernels.service with `systemctl list-dependencies --before --all purge-kernels.service` for example.
Or maybe add an additional "ExecStartPre=/usr/bin/sleep 60" in the service to see if the problem is really an ordering issue.
Maybe it's more related to slow IOs or such. If that's the case, you might consider purging the kernel regularly during runtime (with a timer unit and with limited IO resources for the service) instead of doing that during the boot process.
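[Editorial note] The ExecStartPre= experiment from comment #13 could be tried with a drop-in rather than by editing the packaged unit. A minimal sketch, assuming the usual `<unit>.d/*.conf` drop-in layout; `write_delay_dropin` and the file name `50-delay.conf` are made up for illustration and do not come from the bug.

```shell
#!/bin/sh
# Sketch (not from the bug report): write a drop-in that delays
# purge-kernels.service, to test whether the slow boot is an ordering issue.
# write_delay_dropin and 50-delay.conf are hypothetical names.

# write_delay_dropin DIR [SECONDS]
# DIR is the unit directory, e.g. /etc/systemd/system; SECONDS defaults to 60.
write_delay_dropin() {
    dropin_dir=$1/purge-kernels.service.d
    delay=${2:-60}
    mkdir -p "$dropin_dir" || return 1
    cat > "$dropin_dir/50-delay.conf" <<EOF
[Service]
ExecStartPre=/usr/bin/sleep $delay
EOF
}
```

Run as root, this would be something like `write_delay_dropin /etc/systemd/system 60` followed by `systemctl daemon-reload`. If boot is still held up for the duration of the sleep, the delay is an ordering problem and not slow I/O. The drop-in should be removed again after the experiment.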

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c15
--- Comment #15 from Michal Koutný <mkoutny@suse.com> ---
(In reply to Tejas Guruswamy from comment #14)
> systemctl show -p Before purge-kernels.service Before=shutdown.target YaST2-Second-Stage.service multi-user.target YaST2-Firstboot.service
I'm not sure purge-kernels.service:Before=multi-user.target would cause the 'waiting for boot to finish' (actually is it a verbatim message? I can't grep it in sources). (The YaST2* service may cause it but it's not what the original report was about AFAICT.)
> Because of the '[Install] WantedBy=multi-user.target', it is getting an implicit Before=multi-user.target.
> Is After=multi-user.target ok to fix this? Does it even need an [Install] section at all?
You need some other unit to pull-in purge-kernels.service into the boot transaction.
(In reply to Franck Bui from comment #13)
> I don't think Type=exec is the correct fix here. If there's anything wrong with purge-kernels.service ordering constraints then they should be fixed instead of tweaking the type of the service itself.
It IMO depends whether the mere launch of the service is important or someone relies on old kernels being truly gone (i.e. wait for purge-kernels to finish).
> Maybe it's more related to slow IOs or such.
Yes, in that case the Type=exec workaround would prove ineffective. So I wonder when we can get feedback on the maint-request.

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c16
--- Comment #16 from Michal Suchanek <msuchanek@suse.com> ---
(In reply to Michal Koutný from comment #15)
> (In reply to Tejas Guruswamy from comment #14)
> > systemctl show -p Before purge-kernels.service Before=shutdown.target YaST2-Second-Stage.service multi-user.target YaST2-Firstboot.service
> I'm not sure purge-kernels.service:Before=multi-user.target would cause the 'waiting for boot to finish' (actually is it a verbatim message? I can't grep it in sources).
AFAICT it's /usr/lib/systemd/system/plymouth-quit-wait.service

https://bugzilla.suse.com/show_bug.cgi?id=1198668
Michal Suchanek <msuchanek@suse.com> changed:
What  |Removed |Added
----------------------------------------------------------------------------
CC    |        |petr.vorel@suse.com
Flags |        |needinfo?(petr.vorel@suse.com)

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c19
Petr Vorel <petr.vorel@suse.com> changed:
What       |Removed                        |Added
----------------------------------------------------------------------------
Status     |NEW                            |RESOLVED
Resolution |---                            |FIXED
Flags      |needinfo?(petr.vorel@suse.com) |
--- Comment #19 from Petr Vorel <petr.vorel@suse.com> ---
I haven't noticed any problem for quite a long time. I guess changing the service type to exec solved the problem, thus closing.
https://build.opensuse.org/request/show/977061

https://bugzilla.suse.com/show_bug.cgi?id=1198668 https://bugzilla.suse.com/show_bug.cgi?id=1198668#c22
--- Comment #22 from Maintenance Automation <maint-coord+maintenance-robot@suse.de> ---
SUSE-RU-2023:0793-1: An update that has one recommended fix can now be installed.
Category: recommended (moderate)
Bug References: 1198668
Sources used:
openSUSE Leap 15.4 (src): purge-kernels-service-0-150200.8.6.1
Basesystem Module 15-SP4 (src): purge-kernels-service-0-150200.8.6.1
SUSE Linux Enterprise Real Time 15 SP3 (src): purge-kernels-service-0-150200.8.6.1
NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.