[Bug 1183739] New: Kernel update with a release minor number bump causes old kernel removal
http://bugzilla.opensuse.org/show_bug.cgi?id=1183739

Bug ID: 1183739
Summary: Kernel update with a release minor number bump causes old kernel removal
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: kerossin@pm.me
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

When a kernel update is released that only bumps the last number of the release, the older kernel is removed and the new one is installed in its place, instead of the new one simply being installed alongside it.
rpm -qi kernel-default-5.11.4
Version : 5.11.4
Release : 1.3 <- this is the number I'm talking about
Perhaps a real example will make this clearer. Recently Tumbleweed was on kernel 5.11.4-1.2, and snapshot 20210315 had an update to 5.11.4-1.3. Zypper handles this update by removing -1.2 first and then installing -1.3. With the current configuration of the kernel packages it's not possible to install both of these packages side by side, because they use the exact same file and directory names: the minor release number is omitted, e.g. /boot/initrd-5.11.4-1-default, /lib/modules/5.11.4-1-default/.

What problems does this cause?

A fundamental one is that it doesn't follow the convention and what everyone expects: a kernel update shouldn't touch the previous kernel and should be installed separately.

Another one, and the way we noticed this issue, is that it can break out-of-tree kernel modules during updates. We specifically noticed this with the Nvidia driver kernel modules, and whether it happens depends on how lucky you are. When 'zypper dup' installs kernel-default before kernel-default-devel, you're in luck and no problem occurs. But if zypper installs kernel-default-devel before kernel-default, that's when the problem occurs.

Nvidia kernel module installation is triggered by the installation of kernel-default-devel, and those modules are put into /lib/modules/$kernelversion/updates/. The removal of those modules is triggered by the removal of kernel-default. So here's what happens in the unlucky case with snapshot 20210315:

1) kernel-default-devel-5.11.4-1.3 is installed
2) Nvidia modules are installed at /lib/modules/5.11.4-1-default/updates/
3) kernel-default-5.11.4-1.2 is removed
4) Nvidia modules are deleted from /lib/modules/5.11.4-1-default/updates/
5) kernel-default-5.11.4-1.3 is installed

Now your new kernel is left without Nvidia drivers and it will boot to the terminal.
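The sequence above can be sketched as a small toy model. This is only an illustration of the ordering problem as described in this report, not how zypper or the package scriptlets are actually implemented; the names modules_dir and simulate and the "nvidia" label are hypothetical and exist only for this sketch.

```python
# Toy model of the update sequence described in this report.
# Assumption: this is NOT how zypper or the rpm scriptlets really work;
# it only mirrors the triggers as the report describes them.


def modules_dir(version: str, release: str) -> str:
    """Build the modules directory name, dropping the minor release number."""
    major_release = release.split(".")[0]
    return f"/lib/modules/{version}-{major_release}-default"


def simulate(steps):
    """Apply (action, package, version, release) steps in order.

    Returns a dict mapping each modules directory to the set of
    out-of-tree modules still present under its updates/ subdirectory.
    """
    updates = {}
    for action, package, version, release in steps:
        path = modules_dir(version, release)
        if action == "install" and package == "kernel-default-devel":
            # Per the report: installing -devel triggers the module install.
            updates.setdefault(path, set()).add("nvidia")
        elif action == "remove" and package == "kernel-default":
            # Per the report: removing kernel-default triggers module removal.
            updates.pop(path, None)
    return updates


# Unlucky order: -devel is installed before the old kernel is removed.
unlucky = [
    ("install", "kernel-default-devel", "5.11.4", "1.3"),
    ("remove", "kernel-default", "5.11.4", "1.2"),
    ("install", "kernel-default", "5.11.4", "1.3"),
]

# Lucky order: kernel-default is handled before -devel.
lucky = [
    ("remove", "kernel-default", "5.11.4", "1.2"),
    ("install", "kernel-default", "5.11.4", "1.3"),
    ("install", "kernel-default-devel", "5.11.4", "1.3"),
]

print("unlucky:", simulate(unlucky))
print("lucky:  ", simulate(lucky))
```

In this model the unlucky order ends with no modules left, because both releases map to the same directory and the removal of -1.2 wipes what the -1.3 devel package just triggered. The lucky order leaves the modules in place.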
If the kernel packages used file and directory names with the full release number (/lib/modules/5.11.4-1.2-default/, /lib/modules/5.11.4-1.3-default/), this wouldn't happen, and the removal of the old kernel wouldn't be necessary either. The way I see it, this potentially affects more than the Nvidia drivers: other kernel modules managed in a similar fashion could be affected too.

*Note: the same thing occurred with snapshot 20210222, when the kernel went from 5.10.16-1.2 to 5.10.16-1.3.

Solutions:

Is it possible to use the full release number in file and directory names, so that these kernels would be installed alongside old kernels? Or is perhaps an easier solution to not increment the minor release number and increment only the major one, e.g. 5.11.4-1, 5.11.4-2, 5.11.4-3, etc.?

--
You are receiving this mail because:
You are the assignee for the bug.