[Bug 1099745] New: [kubic][transactional server] systemd fails to boot, reports $subvolume already mounted
http://bugzilla.opensuse.org/show_bug.cgi?id=1099745

Bug ID: 1099745
Summary: [kubic][transactional server] systemd fails to boot, reports $subvolume already mounted
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Critical
Priority: P5 - None
Component: Other
Assignee: fbui@suse.com
Reporter: rbrown@suse.com
QA Contact: qa-bugs@suse.de
CC: iforster@suse.com, kukuk@suse.com
Found By: ---
Blocker: ---

Created attachment 775783
  --> http://bugzilla.opensuse.org/attachment.cgi?id=775783&action=edit
journal -b with debug log enabled

This bug has already been debugged significantly by Franck and me, but for completeness here is the report as it stands right now.

# SYMPTOMS

On Transactional Tumbleweed or Kubic systems there is a clear race condition: on some boots, one or more of the .mount units required by local-fs.target fail to mount, with errors like the following:

mount[661]: mount: /boot/grub2/x86_64-efi: /dev/sda2 already mounted on /.

The failing subvolume is not consistent, and on occasion more than one subvolume fails to mount with the same error.

Once the user enters the emergency shell, any attempt to mount the subvolume works perfectly fine. It is not "already mounted" by the time the sysadmin can log in to the emergency shell.

# STEPS TO REPRODUCE

- Install Tumbleweed with a transactional server role, all default settings
- Create a cronjob to run "reboot -f" every minute
- Wait until the system fails to boot

This has been reproduced on an Intel i5 NUC somewhat reliably. It occurs on average approximately every dozen reboots.

To rule out any kind of bus/hardware issue, it has been successfully reproduced on the following devices holding the system's rootfs:

- SATA 7200 HDD
- SATA SSD
- M2 SSD
- USB SD Card (Multiple)
- USB Pen Drive

Enabling systemd debug logging seems to slow things down enough that this bug is much harder to catch, but not impossible. Debug logs are attached.

# USER IMPACT

Because the boot fails at this point, systemd drops to the emergency shell. The system is therefore unusable, hence the critical severity of this bug.

Transactional Tumbleweed & Kubic machines reboot regularly (as a result of every package change, update, etc), further increasing the severity of this bug - if the bug occurs on average every 12 boots, users can expect one major outage of each system at least every month.

Neither distribution can be considered reliable in normal operation until this bug is mitigated or resolved.

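The outage estimate above can be sanity-checked with a little arithmetic. The following is a minimal sketch, assuming each boot fails independently with probability 1/12 (the report only states an observed average, so independence is an assumption, and the function names are made up for illustration):

```python
def p_at_least_one_failure(boots: int, p_fail: float = 1 / 12) -> float:
    """Probability that at least one of `boots` reboots hits the bug.

    Assumes every boot fails independently with probability p_fail.
    """
    return 1.0 - (1.0 - p_fail) ** boots


def expected_boots_until_failure(p_fail: float = 1 / 12) -> float:
    """Mean number of boots up to and including the first failure (geometric)."""
    return 1.0 / p_fail


if __name__ == "__main__":
    for boots in (1, 4, 12, 30):
        print(boots, round(p_at_least_one_failure(boots), 3))
    print("expected boots until first failure:", expected_boots_until_failure())
```

Under these assumptions, a machine that reboots about a dozen times a month has roughly a two-in-three chance of at least one failed boot in that month, which is in line with the estimate in the report.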
# AFFECTED SYSTEMS (Tested, able to reproduce)

- Tumbleweed (Transactional Server Role)
- Kubic (All Installations)

# SUSPECTED AFFECTED SYSTEMS

- Future CaaS Platform versions and SLE 15 SPx (with Transactional updates) using systemd versions equal to or later than the one currently in Tumbleweed

# UNAFFECTED SYSTEMS (Tested, unable to reproduce)

- Leap 15 (Including Transactional Server Role)
- Tumbleweed (non transactional roles)

# SUSPECTED UNAFFECTED SYSTEMS

- SLE 15 GA

# PRELIMINARY HYPOTHESIS

As this doesn't happen on all Tumbleweed systems, this bug is likely triggered by the presence of /var (btrfs subvolume) and /etc (overlayfs related to /var) in fstab-sys in the initrd. This is needed on a transactional system so that the contents of /etc's overlayfs can be read by the initrd. It is not present on non-transactional Tumbleweed system roles.

However, Leap 15 transactional systems have an identical initrd & transactional update configuration. As this doesn't happen on Leap 15 systems, this bug is clearly triggered by changes to systemd introduced in versions later than the one in Leap 15 and SLE 15.

My hypothesis is that one of the recently introduced changes attempts to mount subvolumes too early. I suspect systemd might be unmounting the devices used by the initrd before remounting them as part of local-fs.target. If systemd is not waiting long enough for the unmount to complete before attempting to mount the first .mount unit, that could match the observed behaviour.

-- 
You are receiving this mail because:
You are on the CC list for the bug.

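One way to check the "already mounted" claim against the actual mount table is to list every mount point that a given device currently backs. The sketch below parses text in the /proc/self/mountinfo format described in proc(5). The sample lines and helper names are hypothetical and not taken from the affected system, and octal escapes such as \040 in paths are not decoded.

```python
from typing import List, NamedTuple


class Mount(NamedTuple):
    root: str         # path inside the filesystem (e.g. the btrfs subvolume)
    mount_point: str  # where it is mounted
    fstype: str
    source: str       # mount source, e.g. /dev/sda2


def parse_mountinfo(text: str) -> List[Mount]:
    """Parse mountinfo-formatted text into Mount records."""
    mounts = []
    for line in text.splitlines():
        fields = line.split()
        try:
            # Optional fields end at a lone "-" that cannot come before index 6.
            sep = fields.index("-", 6)
        except ValueError:
            continue  # not a well-formed mountinfo line
        if len(fields) < sep + 3:
            continue
        mounts.append(Mount(fields[3], fields[4], fields[sep + 1], fields[sep + 2]))
    return mounts


def mount_points_for(source: str, text: str) -> List[str]:
    """Return every mount point backed by `source`, in table order."""
    return [m.mount_point for m in parse_mountinfo(text) if m.source == source]


# Hypothetical sample, for illustration only.
SAMPLE = """\
25 1 0:23 /@/.snapshots/1/snapshot / rw,relatime shared:1 - btrfs /dev/sda2 rw,space_cache
60 25 0:23 /@/var /var rw,relatime shared:30 - btrfs /dev/sda2 rw,space_cache
61 25 0:5 / /dev rw,nosuid shared:2 - devtmpfs devtmpfs rw
"""

if __name__ == "__main__":
    print(mount_points_for("/dev/sda2", SAMPLE))
```

On a live system the same functions could be fed the contents of /proc/self/mountinfo, for example from the emergency shell, to see which subvolumes of the device are mounted at that moment.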
http://bugzilla.opensuse.org/show_bug.cgi?id=1099745#c2

--- Comment #2 from Franck Bui <fbui@suse.com> ---
From the debug logs, the mount ordering looks correct so I can't explain why mounting /boot/grub2/x86_64-efi fails.
For the record, the command is:

  /usr/bin/mount /dev/disk/by-uuid/d4da036b-b69a-4717-9d82-935e0d764cd8 /boot/grub2/x86_64-efi -t btrfs -o subvol=/@/boot/grub2/x86_64-efi

It fails with return code 32, which I think basically means that the mount syscall fails somehow. So I would say something went wrong in the kernel.

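For reference, the exit status of mount(8) is a bit mask, and the man page lists 32 as "mount failure". The helper below is a small sketch that decodes such a status. The table is transcribed from the mount(8) man page as best I know it, and the function name is made up for illustration.

```python
from typing import List

# Exit status bits as documented in mount(8); the bits can be ORed together.
MOUNT_EXIT_BITS = {
    1: "incorrect invocation or permissions",
    2: "system error (out of memory, cannot fork, no more loop devices)",
    4: "internal mount bug",
    8: "user interrupt",
    16: "problems writing or locking /etc/mtab",
    32: "mount failure",
    64: "some mount succeeded",
}


def decode_mount_status(status: int) -> List[str]:
    """Return the documented meaning of each bit set in a mount(8) exit status."""
    if status == 0:
        return ["success"]
    return [text for bit, text in MOUNT_EXIT_BITS.items() if status & bit]


if __name__ == "__main__":
    print(decode_mount_status(32))
```

A status of 32 only says that the mount attempt itself failed. It does not identify which layer rejected it, so the more specific cause has to come from the error text and the kernel log.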
http://bugzilla.opensuse.org/show_bug.cgi?id=1099745#c3

--- Comment #3 from Franck Bui <fbui@suse.com> ---
(In reply to Thorsten Kukuk from comment #1)
> We had already once in the past the problem, that mounting of several subvolumes in parallel did lead to an error. Maybe again a problem with btrfs?

Interesting, do you have a pointer to share by chance?

http://bugzilla.opensuse.org/show_bug.cgi?id=1099745#c4

--- Comment #4 from Franck Bui <fbui@suse.com> ---
(In reply to Richard Brown from comment #0)
> Once the user enters the Emergency shell, any attempt to mount the subvolume works perfectly fine - it is not "already mounted" by the time the sysadmin can login to the Emergency shell
> [...]
> Due to the failure of the boot at this point, systemd drops to the emergency shell
> The system is therefore unusable, hence the critical severity of this bug.

From the emergency shell, you should be able to mount the subvolume manually and resume the boot process by exiting the emergency shell, no?

http://bugzilla.opensuse.org/show_bug.cgi?id=1099745#c5

--- Comment #5 from Richard Brown <rbrown@suse.com> ---
(In reply to Franck Bui from comment #4)
> > The system is therefore unusable, hence the critical
>
> From the emergency shell, you should be able to mount the subvolume manually and resume the boot process by exiting the emergency shell, no?

Sure, but the whole point of transactional updates is to create hands-off, fully automatically updating systems, often large clusters of such unattended machines. The fact that a user who might have out-of-band access can manually get the system running again doesn't really mitigate the severity of this issue.