[Bug 886673] New: boot partition gets hosed filled to the last byte or sometimes corrupted before actual kernel update succeeds
https://bugzilla.novell.com/show_bug.cgi?id=886673
https://bugzilla.novell.com/show_bug.cgi?id=886673#c0

Summary: boot partition gets hosed filled to the last byte or sometimes corrupted before actual kernel update succeeds
Classification: openSUSE
Product: openSUSE 12.3
Version: Final
Platform: Other
OS/Version: openSUSE 12.3
Status: NEW
Severity: Major
Priority: P5 - None
Component: libzypp
AssignedTo: zypp-maintainers@forge.provo.novell.com
ReportedBy: abittner@abittner.de
QAContact: qa-bugs@suse.de
Found By: ---
Blocker: ---

User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:30.0) Gecko/20100101 Firefox/30.0

The boot partition gets filled to the last byte, or sometimes corrupted, before the actual kernel update succeeds (e.g. menu.lst from grub was erased, but as /boot was full no new menu.lst could be written to it again, leaving the whole system unable to boot).

Please add more sanity checks to operations on a separate /boot/ partition, or wherever /boot/ resides in general, and do NOT remove essential files such as menu.cfg *before* putting the new file there. The workflow should rather be: first write the new file under a new name; only if that succeeds, remove the old file with the proper name and then rename the new file to the proper name.

Today I just had a zypper ref and zypper up with a kernel update incoming on a 12.3 x86 system, which again failed in the kernel rpm, this time filling up /boot/ to 100%. It didn't actually kill the system this time, but I first need to figure out what can safely be erased from the crowded /boot/ partition before retriggering the zypper dup.

Why can the kernel updates not clean up older kernels? The system seems to have multiple initrd, symver, vmlinuz and vmlinux files in /boot/ in different versions, all versioned 3.7.10, so they ought to all be from older kernel updates and stages on this openSUSE 12.3. Why are multiples being kept, leading to the /boot/ partition eventually being filled, these kernel updates failing thereafter, and machines even being rendered disabled? :(

Please make kernel updates, zypper, or whatever components are involved here, or the handling of free space on disk, much more robust, and do not delete files before their new version has successfully been written to disk. Thank you.

Reproducible: Sometimes

Steps to Reproduce: 1. 2. 3.

-- 
Configure bugmail: https://bugzilla.novell.com/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.
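The write-first, swap-afterwards workflow requested in the report can be sketched roughly as follows. This is only an illustration in Python, not what zypper, mkinitrd or the grub tools actually do; the function name replace_file_atomically is hypothetical. It assumes a POSIX filesystem, where a rename onto an existing name replaces it in one step, so the separate "remove the old file" step from the report is not needed.

```python
import os
import tempfile


def replace_file_atomically(path, new_content):
    """Write new_content to a temporary file, then rename it over path.

    Hypothetical helper for illustration: if writing fails (for example
    because the partition is full), the original file is left untouched.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temporary file in the same directory so that the final
    # rename never has to cross a filesystem boundary.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w") as tmp:
            tmp.write(new_content)
            tmp.flush()
            os.fsync(tmp.fileno())  # make sure the data reached the disk
        # Only now is the old file replaced; on POSIX this is one rename.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the partial temporary file and report the failure.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

With this ordering, a full /boot makes the write to the temporary file fail, and the existing menu.lst (or equivalent) stays in place.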
https://bugzilla.novell.com/show_bug.cgi?id=886673#c1
--- Comment #1 from Michael Andres
> The boot partition gets filled to the last byte, or sometimes corrupted, before the actual kernel update succeeds (e.g. menu.lst from grub was erased, but as /boot was full no new menu.lst could be written to it again, leaving the whole system unable to boot).

That's probably something that needs to be addressed in mkinitrd and/or the kernel's pre/post scripts.

> Why can the kernel updates not clean up older kernels? The system seems to have multiple initrd, symver, vmlinuz and vmlinux files in /boot/ in different versions, all versioned 3.7.10, so they ought to all be from older kernel updates and stages on this openSUSE 12.3. Why are multiples being kept, leading to the /boot/ partition eventually being filled, these kernel updates failing thereafter, and machines even being rendered disabled? :(

mkinitrd provides a tool for this (/sbin/purge-kernels). Its config option is available in /etc/zypp/zypp.conf:

## Comma separated list of kernel packages to keep installed in parallel,
## if the above multiversion variable is set. Packages can be specified as
## 2.6.32.12-0.7 - Exact version to keep
## latest - Keep kernel with the highest version number
## latest-N - Keep kernel with the Nth highest version number
## running - Keep the running kernel
## oldest - Keep kernel with the lowest version number (the GA kernel)
## oldest+N - Keep kernel with the Nth lowest version number
##
## Note: This entry is not evaluated by libzypp, but by the
## purge-kernels service (via /sbin/purge-kernels).
##
## Default: Do not delete any kernels if multiversion(kernel) is set
multiversion.kernels = latest,latest-1,running

Passing the issue to the mkinitrd maintainer. Please forward it to the kernel guys if necessary.
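To make the keep list above more concrete, here is a rough Python sketch of how a value such as latest,latest-1,running might be resolved against the installed kernels. It is not the real /sbin/purge-kernels logic: the function names are hypothetical, the version ordering is a simplified numeric comparison rather than rpm's algorithm, and it assumes that latest-1 means "the version just below the highest" and oldest+1 "the version just above the lowest".

```python
import re


def version_key(version):
    """Simplified sort key: compares the numeric fields only.

    Assumption for illustration; rpm's real version comparison is richer.
    """
    return tuple(int(part) for part in re.findall(r"\d+", version))


def kernels_to_keep(installed, running, spec):
    """Return the set of installed versions selected by a keep spec."""
    ordered = sorted(set(installed), key=version_key)  # lowest first
    keep = set()
    for item in (entry.strip() for entry in spec.split(",")):
        if not item:
            continue
        if item == "running":
            if running in ordered:
                keep.add(running)
            continue
        latest = re.fullmatch(r"latest(?:-(\d+))?", item)
        oldest = re.fullmatch(r"oldest(?:\+(\d+))?", item)
        if latest:
            index = len(ordered) - 1 - int(latest.group(1) or 0)
        elif oldest:
            index = int(oldest.group(1) or 0)
        elif item in ordered:
            keep.add(item)  # exact version to keep
            continue
        else:
            continue  # unknown entry or version not installed
        if 0 <= index < len(ordered):
            keep.add(ordered[index])
    return keep


installed = ["3.7.10-1.32.1", "3.7.10-1.36.1", "3.7.10-1.40.1"]
keep = kernels_to_keep(installed, "3.7.10-1.36.1", "latest,latest-1,running")
purge = sorted(set(installed) - keep, key=version_key)
```

Under these assumptions, and with 3.7.10-1.36.1 running, keep contains 3.7.10-1.40.1 and 3.7.10-1.36.1, and purge contains only 3.7.10-1.32.1.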
https://bugzilla.novell.com/show_bug.cgi?id=886673#c2
--- Comment #2 from andreas bittner
https://bugzilla.novell.com/show_bug.cgi?id=886673#c3
--- Comment #3 from andreas bittner
https://bugzilla.novell.com/show_bug.cgi?id=886673#c4
--- Comment #4 from Michal Marek
> happened again:
>
> ( 5/16) Installing: kernel-desktop-3.7.10-1.40.1 ..................................................................................[error]
> Installation of kernel-desktop-3.7.10-1.40.1 failed:
> (with --nodeps --force) Error: Subprocess failed. Error: RPM failed: installing package kernel-desktop-3.7.10-1.40.1.i686 needs 16MB on the /boot filesystem
> Abort, retry, ignore? [a/r/i] (a):

So it failed safe, didn't it?

> /dev/sda2 99M 99M 0 100% /boot
>
> Linux tux.local 3.7.10-1.36-desktop #1 SMP PREEMPT Thu Jun 12 10:14:12 UTC 2014 (fcb6f8f) i686 athlon i386 GNU/Linux
>
> rpm -aq | grep -i kerne
> kernel-firmware-20130714git-1.9.1.noarch
> kernel-desktop-3.7.10-1.36.1.i686
> kernel-desktop-3.7.10-1.32.1.i686

If your /boot partition cannot host three kernels, then you either have to enlarge it or disable multiversion.

> -rw------- 1 root root 32099815 May 21 11:42 initrd-3.7.10-1.32-desktop

Can you attach the output of zcat /boot/initrd-3.7.10-1.32-desktop | cpio -tv?
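The "needs 16MB on the /boot filesystem" message shows rpm checking free space before it installs. A script that regenerates an initrd or a boot loader config could do a comparable check before touching anything. The following Python sketch is only an illustration; has_room and its reserve_bytes parameter are hypothetical names, not part of any openSUSE tool.

```python
import shutil

MIB = 1024 * 1024


def has_room(path, needed_bytes, reserve_bytes=0):
    """Return True if the filesystem holding path has enough free space.

    Hypothetical helper: reserve_bytes is extra headroom to leave free
    after the planned write, for example for a regenerated config file.
    """
    free = shutil.disk_usage(path).free
    return free >= needed_bytes + reserve_bytes


# Example: refuse to start if /boot cannot take a 32 MiB initrd plus
# 1 MiB of headroom (sizes chosen only for illustration).
# if not has_room("/boot", 32 * MIB, reserve_bytes=1 * MIB):
#     raise SystemExit("not enough free space on /boot, aborting")
```

Checking up front keeps the failure at the "nothing has been changed yet" stage, which is the behaviour the zypper output above already shows for the package itself.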
https://bugzilla.novell.com/show_bug.cgi?id=886673#c5
--- Comment #5 from andreas bittner
https://bugzilla.novell.com/show_bug.cgi?id=886673#c6
--- Comment #6 from Michal Marek
> This should be the main point of this bug report. The install/upgrade scripts and components of kernels (and related) should rather copy the config file to a new file, edit/modify and save the updated file, and only if this fully succeeds should the original (still working, still existing, still completely stored on disk) file be deleted/renamed/removed.
>
> I don't care if multikernels or not or whatever, this is a pita bug and crazy behavior.
>
> Copy on write, or whatever terms come to my mind as overall algorithms. You just mustn't delete valuable config files of systems, then fail to write the new files to the system's storage, and so basically kill the major functionality of the user's system (to boot).

OK, but this is a bug in the grub2 tools then. The kernel does not create any config files.

> Coming back with the requested info: I don't have that initrd any more. Obviously I had to make space some way or another, so I deleted the oldest stuff, but I can provide the output for the more recent initrd.

Is this initrd also over 30MB in size?

> sudo zcat initrd-3.7.10-1.36-desktop | cpio -tv | more
> 122895 blocks

Please redirect the output to a file and attach that file. This is unreadable.
https://bugzilla.novell.com/show_bug.cgi?id=886673#c7
--- Comment #7 from andreas bittner
https://bugzilla.novell.com/show_bug.cgi?id=886673#c8
--- Comment #8 from andreas bittner