http://bugzilla.suse.com/show_bug.cgi?id=1149980
Bug ID: 1149980
Summary: cleanup target system unmounting in umount_finish
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Installation
Assignee: yast2-maintainers@suse.de
Reporter: snwint@suse.com
QA Contact: jsrain@suse.com
Found By: ---
Blocker: ---
Created attachment 817376 --> http://bugzilla.suse.com/attachment.cgi?id=817376&action=edit yast log
umount_finish (https://github.com/yast/yast-installation/blob/master/src/lib/installation/c...)
unmounts the filesystems that have been mounted to the target system during the installation (i.e. everything below /mnt).
It does so *twice*.
As it usually succeeds in the first iteration, the second iteration mostly fails.
And as there's quite some fallback stuff programmed in (like lazy unmounting and ro-remounting) this leads to quite a mess in the logs and some unneeded activities.
See attached log fragment for an example.
http://bugzilla.suse.com/show_bug.cgi?id=1149980#c1
Steffen Winterfeldt snwint@suse.com changed:
What    |Removed                  |Added
----------------------------------------------------------------------------
Priority|P5 - None                |P3 - Medium
Status  |NEW                      |CONFIRMED
URL     |                         |https://trello.com/c/XaqRXabj
Assignee|yast2-maintainers@suse.de|yast-internal@suse.de
--- Comment #1 from Steffen Winterfeldt snwint@suse.com --- Tracking in YaST Scrum board.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c2
--- Comment #2 from Stefan Hundhammer shundhammer@suse.com --- From that y2log it appears that it's the Btrfs subvolumes that cause all those confusing errors in the log.
All entries in /proc/mounts (reformatted and minus the mount options to avoid Bugzilla mangling it completely):
tmpfs       /                            tmpfs
tmpfs       /                            tmpfs
proc        /proc                        proc
sysfs       /sys                         sysfs
/dev/loop0  /parts/mp_0000               squashfs
/dev/loop1  /parts/mp_0001               squashfs
devtmpfs    /dev                         devtmpfs
devpts      /dev/pts                     devpts
rpc_pipefs  /var/lib/nfs/rpc_pipefs      rpc_pipefs
/dev/loop2  /mounts/mp_0000              squashfs
/dev/loop3  /mounts/mp_0001              squashfs
/dev/loop5  /mounts/mp_0003              squashfs
/dev/loop6  /mounts/mp_0004              squashfs
/dev/sda2   /mnt                         btrfs
/dev/sda2   /mnt/.snapshots              btrfs
/dev/sda2   /mnt/boot/grub2/i386-pc      btrfs
/dev/sda2   /mnt/boot/grub2/x86_64-efi   btrfs
/dev/sda2   /mnt/home                    btrfs
/dev/sda2   /mnt/opt                     btrfs
/dev/sda2   /mnt/root                    btrfs
/dev/sda2   /mnt/srv                     btrfs
/dev/sda2   /mnt/tmp                     btrfs
/dev/sda2   /mnt/usr/local               btrfs
/dev/sda2   /mnt/var                     btrfs
devtmpfs    /mnt/dev                     devtmpfs
proc        /mnt/proc                    proc
sysfs       /mnt/sys                     sysfs
tmpfs       /mnt/run                     tmpfs
But we only need the mounts at or below /mnt, so let's grep for that:
/dev/sda2   /mnt                         btrfs
/dev/sda2   /mnt/.snapshots              btrfs
/dev/sda2   /mnt/boot/grub2/i386-pc      btrfs
/dev/sda2   /mnt/boot/grub2/x86_64-efi   btrfs
/dev/sda2   /mnt/home                    btrfs
/dev/sda2   /mnt/opt                     btrfs
/dev/sda2   /mnt/root                    btrfs
/dev/sda2   /mnt/srv                     btrfs
/dev/sda2   /mnt/tmp                     btrfs
/dev/sda2   /mnt/usr/local               btrfs
/dev/sda2   /mnt/var                     btrfs
devtmpfs    /mnt/dev                     devtmpfs
proc        /mnt/proc                    proc
sysfs       /mnt/sys                     sysfs
tmpfs       /mnt/run                     tmpfs
Notice that all the snapshots on the root filesystem are also there. But unmounting them separately is what fails with that "filesystem busy" error, so that's what we need to avoid.
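For illustration only (this is not the YaST code, which is Ruby), the "grep for /mnt" step could be sketched in Python roughly as follows. The names Mount, parse_mounts and below_prefix are made up for this sketch, and the sample text is a shortened stand-in for the real /proc/mounts with simplified mount options.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: str


def parse_mounts(text: str) -> List[Mount]:
    """Parse /proc/mounts-style text into Mount records."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mounts.append(Mount(*fields[:4]))
    return mounts


def below_prefix(mounts: List[Mount], prefix: str = "/mnt") -> List[Mount]:
    """Keep the prefix itself and everything mounted below it."""
    return [
        m for m in mounts
        if m.mount_point == prefix or m.mount_point.startswith(prefix + "/")
    ]


# Shortened sample; the mount options are simplified assumptions.
SAMPLE = """\
tmpfs / tmpfs rw 0 0
/dev/loop2 /mounts/mp_0000 squashfs ro 0 0
/dev/sda2 /mnt btrfs rw 0 0
/dev/sda2 /mnt/home btrfs rw 0 0
proc /mnt/proc proc rw 0 0
"""

target_mounts = below_prefix(parse_mounts(SAMPLE))
for m in target_mounts:
    print(m.device, m.mount_point, m.fs_type)
```

Comparing against the prefix plus a trailing slash, rather than doing a plain substring match, avoids accidentally picking up a hypothetical mount point such as /mntfoo.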
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c3
--- Comment #3 from Stefan Hundhammer shundhammer@suse.com --- The trouble is that since we are also installing into a snapshot (i.e. subvolume), it's very hard to tell them apart:
/dev/sda2 /mnt btrfs rw,relatime,space_cache,subvolid=268,subvol=/@/.snapshots/1/snapshot 0 0
/dev/sda2 /mnt/.snapshots btrfs rw,relatime,space_cache,subvolid=267,subvol=/@/.snapshots 0 0
/dev/sda2 /mnt/boot/grub2/i386-pc btrfs rw,relatime,space_cache,subvolid=266,subvol=/@/boot/grub2/i386-pc 0 0
... ...
/dev/sda2 /mnt/var btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/var 0 0
Filtering out all entries with "subvolid" or "subvol=" leaves nothing.
We also can't only take the first entry: If the user decides to create a separate /home and format that with Btrfs as well, that would leave /mnt/home mounted. Worse, the user might decide to also create subvolumes and/or prepare for snapshots on that separate /home (unlikely, but possible).
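To make the point about the mount options concrete, here is a small Python sketch (illustration only, not the YaST implementation). It parses the option field of the entries quoted above and collects the mount points that carry neither "subvolid" nor "subvol". The helper name mount_options is hypothetical.

```python
from typing import Dict, Optional


def mount_options(option_field: str) -> Dict[str, Optional[str]]:
    """Split 'rw,relatime,subvolid=268' into {'rw': None, ..., 'subvolid': '268'}."""
    opts: Dict[str, Optional[str]] = {}
    for item in option_field.split(","):
        key, sep, value = item.partition("=")
        opts[key] = value if sep else None
    return opts


# Entries copied from the /proc/mounts excerpt in this comment.
LINES = [
    "/dev/sda2 /mnt btrfs rw,relatime,space_cache,subvolid=268,subvol=/@/.snapshots/1/snapshot 0 0",
    "/dev/sda2 /mnt/.snapshots btrfs rw,relatime,space_cache,subvolid=267,subvol=/@/.snapshots 0 0",
    "/dev/sda2 /mnt/var btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/var 0 0",
]

without_subvol = []
for line in LINES:
    device, mount_point, fs_type, options = line.split()[:4]
    opts = mount_options(options)
    if "subvolid" not in opts and "subvol" not in opts:
        without_subvol.append(mount_point)

# Every Btrfs entry, including the one at /mnt itself, carries both options,
# so nothing survives the filter.
print(without_subvol)
```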
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c4
--- Comment #4 from Stefan Hundhammer shundhammer@suse.com --- Deciding which of the entries in /proc/mounts is a true toplevel Btrfs and which is only a subvolume is very hard; even more so if it is possible that there are SEVERAL toplevel Btrfs mounts.
One approach might be to check again for each entry during unmounting whether it is still mounted at all: A previous unmount might have taken care of it as well. Of course that works only if we strictly follow the order of /proc/mounts.
But that contradicts the other requirement: Unmount from the deepest nesting level upwards to the target's root because otherwise the root will still be busy. The same applies to mounts like /boot that might have more mounts inside them.
/
├─/boot
│ ├─/boot/efi
│ └─/boot/morestuff
So we NEED to start unmounting from the deepest mounts upwards.
For Btrfs (i.e. all mounts containing mount options "subvolid" or "subvol=") it is tempting to make an exception and start from the top. That works well as long as it's only subvolumes of that same Btrfs mounted inside it.
But it is entirely possible that another Btrfs is mounted to that Btrfs, and that might also have those mount options "subvolid" or "subvol=". In that case, it would not be unmounted first, and unmounting the toplevel Btrfs would fail with "filesystem busy".
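The "deepest mounts first" requirement can be sketched like this in Python (illustration only; the helper names depth and unmount_order are invented here and do not exist in yast-installation). Sorting by the number of path components in descending order guarantees that every mount point comes before its parent, no matter which filesystem it belongs to.

```python
from typing import List


def depth(mount_point: str) -> int:
    """Number of path components: '/' -> 0, '/mnt' -> 1, '/mnt/boot/efi' -> 3."""
    return len([part for part in mount_point.split("/") if part])


def unmount_order(mount_points: List[str]) -> List[str]:
    """Deepest mount points first; ties are broken alphabetically."""
    return sorted(mount_points, key=lambda path: (-depth(path), path))


order = unmount_order(["/mnt", "/mnt/boot", "/mnt/boot/efi", "/mnt/home"])
print(order)
```

This ordering does not need to know whether an entry is a Btrfs subvolume or a separate filesystem, which sidesteps the problem of telling the two apart.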
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c5
--- Comment #5 from Stefan Hundhammer shundhammer@suse.com --- So the only realistic option I see is to resort to yast-storage's device graph and its knowledge about subvolumes: Filter those paths out when unmounting.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c6
--- Comment #6 from Stefan Hundhammer shundhammer@suse.com --- It turns out that you can unmount btrfs subvolumes just fine; but at least one of them, /mnt/var, tends to be busy because libzypp uses /mnt/var/cache/zypp/packages/... for its package cache. If unmounting that one fails with a "busy" error, unmounting its parent Btrfs main volume at /mnt also fails with the same error.
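How one busy mount propagates upwards can be modelled with a small Python simulation (a simplified model, not real umount calls and not the YaST code). It assumes that a mount point cannot be unmounted while it is in use itself or while anything is still mounted below it. The function name simulate_unmount is hypothetical.

```python
from typing import Dict, List, Set


def simulate_unmount(order: List[str], busy: Set[str]) -> Dict[str, str]:
    """Walk the mount points in the given order and report 'ok' or 'busy'."""
    still_mounted = set(order)
    result: Dict[str, str] = {}
    for mount_point in order:
        # Anything still mounted below this path keeps it busy as well.
        prefix = mount_point.rstrip("/") + "/"
        has_children = any(
            other.startswith(prefix)
            for other in still_mounted
            if other != mount_point
        )
        if mount_point in busy or has_children:
            result[mount_point] = "busy"
        else:
            result[mount_point] = "ok"
            still_mounted.discard(mount_point)
    return result


# /mnt/var is kept busy (here: by the package cache), so /mnt fails too.
outcome = simulate_unmount(["/mnt/var", "/mnt/home", "/mnt"], busy={"/mnt/var"})
print(outcome)
```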
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c7
--- Comment #7 from Stefan Hundhammer shundhammer@suse.com --- Pull request: https://github.com/yast/yast-installation/pull/975
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c8
--- Comment #8 from Stefan Hundhammer shundhammer@suse.com --- Related: New bug #1189793 "libzypp cache keeps /mnt/var mount busy" that I wrote when I tested my changes.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c13
--- Comment #13 from Lukas Ocilka locilka@suse.com --- BTW, for the libzypp cache: might it work to stop using libzypp completely? IMO we might still keep some reference in memory. Or maybe we need to call libzypp directly? Maybe MA knows?
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c14
--- Comment #14 from Steffen Winterfeldt snwint@suse.com --- Thanks; the log looks much better now and it even gives useful hints.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c15
--- Comment #15 from Steffen Winterfeldt snwint@suse.com --- I would have expected that after these steps in pre_umount_finish
https://github.com/yast/yast-installation/blob/master/src/lib/installation/c...
zypp should be gone.
Otherwise I don't see where else it should come from considering our workflow:
https://github.com/yast/yast-installation/blob/master/src/lib/installation/c...
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c16
--- Comment #16 from Stefan Hundhammer shundhammer@suse.com --- The emergency umount code:
https://github.com/yast/yast-installation/blob/master/startup/First-Stage/F1...
See also bug #1189793 comment #7 for why we shouldn't rely on that and should get this fixed before it escalates to that last-ditch defense script.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c17
Stefan Hundhammer shundhammer@suse.com changed:
What      |Removed   |Added
----------------------------------------------------------------------------
Status    |CONFIRMED |RESOLVED
Resolution|---       |FIXED
--- Comment #17 from Stefan Hundhammer shundhammer@suse.com --- Closing this refactoring bug after the PR is merged.
For the remaining open mount with the libzypp cache, see the separate bug #1189793.
https://bugzilla.suse.com/show_bug.cgi?id=1149980#c18
--- Comment #18 from Stefan Hundhammer shundhammer@suse.com --- SR to OBS:
https://build.opensuse.org/request/show/914406
SR to IBS:
https://build.suse.de/request/show/248874