[Bug 1155170] New: /boot and /boot/efi not mounted
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170

Bug ID: 1155170
Summary: /boot and /boot/efi not mounted
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: openSUSE Factory
Status: NEW
Severity: Major
Priority: P5 - None
Component: Basesystem
Assignee: bnc-team-screening@forge.provo.novell.com
Reporter: opensuse@mike.franken.de
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

When rebooting after the last two Tumbleweed snapshots containing new kernel packages, the system didn't come up with the newly installed kernel, but with the previous one. Further investigation shows that the latest kernel was installed to /boot on the root partition, because /boot wasn't mounted. /boot/efi wasn't mounted either. Mounting both of them manually worked as expected, but automatic mounts no longer take place since then. The systemd mount units boot.mount and boot-efi.mount do exist in /run/systemd, though.

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c1
--- Comment #1 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c2
--- Comment #2 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c4
--- Comment #4 from Michael Hirmke
Could you please paste your /etc/fstab here? Are you sure that you have entries for /boot and /boot/efi on it? Also the log piece you paste shows
If not, why would I have
  systemd[1]: Mounting /boot...
  systemd[1]: Mounted /boot.
in the logs? And why would it work when the shutdown before was clean?

fstab:
  UUID=bf69d826-e322-4c61-b071-a57180042bca /boot ext4 defaults 0 2
  UUID=F8C7-52FF /boot/efi vfat defaults 0 0
nothing about the problem, could you please upload full journal?
As I wrote - there are no more (error) messages regarding this problem in the journal. Believe me - I am a maniac regarding warnings or errors in my logs.
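For anyone who wants to check this state on their own machine, here is a minimal sketch (not something posted in the bug) that compares the mount points listed in fstab with what is currently mounted. The function names are made up for illustration, and it assumes plain fstab lines like the two quoted above (no escaped spaces in paths).

```python
def fstab_mount_points(fstab_text):
    """Return the mount points an fstab text asks to be mounted at boot."""
    points = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            continue  # not a complete fstab entry
        mount_point, fstype = fields[1], fields[2]
        options = fields[3].split(",") if len(fields) > 3 else []
        # swap has no mount point; noauto entries are not mounted at boot
        if fstype == "swap" or not mount_point.startswith("/") or "noauto" in options:
            continue
        points.append(mount_point)
    return points


def mounted_points(mounts_text):
    """Return the set of mount points from /proc/self/mounts-style text."""
    result = set()
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 2:
            result.add(fields[1])
    return result


def missing_mounts(fstab_text, mounts_text):
    """List fstab mount points that are not currently mounted."""
    mounted = mounted_points(mounts_text)
    return [p for p in fstab_mount_points(fstab_text) if p not in mounted]


# Typical use on a live system:
#   missing_mounts(open("/etc/fstab").read(), open("/proc/self/mounts").read())
```

With the fstab quoted above and a mount table that only contains /, this reports both /boot and /boot/efi as missing.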
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
Alynx Zhou
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c5
--- Comment #5 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c6
--- Comment #6 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c7
--- Comment #7 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c8
--- Comment #8 from Michael Hirmke
4. When the system is up and running, /boot and /boot/efi are not mounting.
"mounting" should read "mounted".
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c9
--- Comment #9 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c11
--- Comment #11 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
Dimitrios Apostolou
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c12
Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c15
--- Comment #15 from Michael Hirmke
No sorry, I never encountered a similar issue before.
(In reply to Michael Hirmke from comment #5)
This weekend I spent some time to analyze this problem in depth. The results are:
Thanks for your time and analysis, it will definitively help pinpointing your problem.
[...]
2. When rebooting after this installation I get:
systemd-cryptsetup[27794]: Device root is still in use.
systemd-cryptsetup[27794]: Failed to deactivate: Device or resource busy
systemd[1]: systemd-cryptsetup@root.service: Control process exited, code=exited, status=1/FAILURE
The root partition on this system is encrypted, boot is not.
Maybe we should focus on this shutdown issue first ?
The fact that /boot and /boot/efi are not mounted on the next reboot seems to be a consequence of it. I'm not saying that we shouldn't look at it, but maybe understanding the first issue (the shutdown one) will help us understand (and reproduce) the second one (/boot and /boot/efi not mounted)?
Whatever may be necessary to solve this problem :)
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c16
Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c17
--- Comment #17 from Michael Hirmke
@Michael, the shutdown logs that show the failure are quite big and hard to parse. Apparently the system was resumed from hibernation at least once.
Can you try to reproduce the shutdown issue after rebooting and without suspending your system so the logs will be much shorter and easier to parse ?
To provide the logs of the previous boot, please use "journalctl -b -1"
Thanks.
ok - have to wait for the next snapshot, though.
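A minimal sketch of how the requested previous-boot log could be saved to a file for attaching. The helper names and the output path are hypothetical; the only journalctl options used are `-b` (as quoted above), `-p` and `--no-pager`.

```python
import subprocess


def previous_boot_journal_cmd(boot_offset=-1, priority=None):
    """Build the journalctl command line for a given boot offset.

    boot_offset=-1 is the previous boot, matching "journalctl -b -1".
    priority is optional, e.g. "err" to show only errors and worse.
    """
    cmd = ["journalctl", "--no-pager", "-b", str(boot_offset)]
    if priority is not None:
        cmd += ["-p", priority]
    return cmd


def save_journal(path, boot_offset=-1, priority=None):
    """Write the selected journal to 'path' (hypothetical helper)."""
    with open(path, "w") as out:
        subprocess.run(previous_boot_journal_cmd(boot_offset, priority),
                       stdout=out, check=True)


# Example (needs a system with a persistent journal):
#   save_journal("previous-boot.txt")
```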
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c18
--- Comment #18 from Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c19
--- Comment #19 from Michael Hirmke
Can't you simply install a new package instead ?
No, doesn't seem so. Just tried to install, reinstall and uninstall a package, but the problem didn't occur after a reboot. Perhaps the problem only occurs, if the install itself requires a reboot!?!?
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c20
--- Comment #20 from Franck Bui
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c21
--- Comment #21 from Michael Hirmke
Can you try to restore a previous snapshot so you can replay the upgrade that led to the bogus shutdown?
I never reverted a snapshot - isn't this possible with btrfs only? The problem occurs on every zypper dup, though, so the next one should be sufficient.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c22
--- Comment #22 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c23
--- Comment #23 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c24
--- Comment #24 from Franck Bui
A lot of testing and many boot cycles later with and without running zypper, hibernating and resuming in between, it seems, that I have two problems.
yes that was my assumption too but I thought that 2. was triggered by 1. which appears to not be the case according to your new findings. Just before providing more feedback, can you tell me which FS is used for / ? Also did you provide any logs when /boot wasn't mounted as it should have ? I don't think so but just prefer making sure. Thanks.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c25
--- Comment #25 from Michael Hirmke
(In reply to Michael Hirmke from comment #23)
A lot of testing and many boot cycles later with and without running zypper, hibernating and resuming in between, it seems, that I have two problems.
yes that was my assumption too but I thought that 2. was triggered by 1. which appears to not be the case according to your new findings.
Just before providing more feedback, can you tell me which FS is used for / ?
All filesystems except of course /boot/efi are ext4.
Also did you provide any logs when /boot wasn't mounted as it should have ? I don't think so but just prefer making sure.
The first log "shutdown-log-err.txt" in the attachment named "two shutdown logs" shows this problem. Because this problem only occurs after hibernating/resuming the system, I assumed the log mentioned would be sufficient even if it is big. A new one wouldn't be much smaller.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c26
--- Comment #26 from Franck Bui
The first log "shutdown-log-err.txt" in the attachment named "two shutdown logs" shows this problem.
Ok I'll take a closer look then.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c27
--- Comment #27 from Franck Bui
1. "Boot device is still in use" on a reboot - the reason for this seems to be an installation via zypper.
So regarding this issue, the problem is that systemd shouldn't attempt to detach the root device at all, since it's going to be used until the very end, when the system switches back to the initrd. Since this problem is not specific to openSUSE, I opened an issue against upstream and it will be tracked at https://github.com/systemd/systemd/issues/14224 from now on. That said, the warning should be harmless and the root device should be unmounted and detached by dracut (Thomas, please correct me if I'm wrong).
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c28
--- Comment #28 from Franck Bui
2. /boot and /boot/efi are not mounted after a reboot - this doesn't occur, if the system is rebooted before running zypper. It only occurs, if the system has been hibernated/resumed after the last reboot and before running zypper.
Could you just check that after hibernating/resuming /boot and /boot/efi are still mounted ?
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c29
--- Comment #29 from Michael Hirmke
(In reply to Michael Hirmke from comment #23)
2. /boot and /boot/efi are not mounted after a reboot - this doesn't occur, if the system is rebooted before running zypper. It only occurs, if the system has been hibernated/resumed after the last reboot and before running zypper.
Could you just check that after hibernating/resuming /boot and /boot/efi are still mounted ?
They must have been mounted the last times zypper dup installed new kernels. Otherwise the kernel files would have been installed in the /boot directory on the root partition instead of the /boot partition. And because I always hibernate/resume this system, and reboots only happen when a zypper dup requires them, I'm pretty sure that zypper has always run after at least one hibernate/resume cycle. But I'll double check whether the partitions are mounted after a simple hibernate/resume cycle.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c30
--- Comment #30 from Michael Hirmke
(In reply to Michael Hirmke from comment #23)
1. "Boot device is still in use" on a reboot - the reason for this seems to be an installation via zypper.
So regarding this issue, the problem is that systemd shouldn't attempt to detach the root device at all, since it's going to be used until the very end, when the system switches back to the initrd.
Since this problem is not specific to openSUSE, I opened an issue against upstream and it will be tracked at https://github.com/systemd/systemd/issues/14224 from now on.
Thx a lot!
That said the warning should be harmless and the root device should be unmounted and detached by dracut (Thomas, please correct me if I'm wrong).
At least there is no message that root has been unmounted correctly. Instead you can see in the log that the last fuser output shows lots of processes blocking the root filesystem. Instead you can see messages like systemd-cryptsetup@root.service: Unit entered failed state. Remember: root is encrypted!
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c31
--- Comment #31 from Franck Bui
At least there is no message that root has been unmounted correctly. Instead you can see in the log that the last fuser output shows lots of processes blocking the root filesystem. Instead you can see messages like
systemd-cryptsetup@root.service: Unit entered failed state.
Remember: root is encrypted!
Sorry but I don't see your point.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c32
--- Comment #32 from Franck Bui
The first log "shutdown-log-err.txt" in the attachment named "two shutdown logs" shows this problem.
I double checked and this appears to be wrong: "shutdown-log-err.txt" includes only 2 suspend/resume cycles and one single shutdown. It doesn't show what happened after the shutdown, i.e. the next boot where /boot and /boot/efi are supposed to not be mounted. So I would suggest to reproduce the issue one more time but with debug logs enabled only when rebooting the system with /boot and /boot/efi not mounted. Also please make sure to answer the question in comment #28. Thanks.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c33
--- Comment #33 from Michael Hirmke
(In reply to Michael Hirmke from comment #25)
The first log "shutdown-log-err.txt" in the attachment named "two shutdown logs" shows this problem.
I double checked and this appears to be wrong: "shutdown-log-err.txt" includes only 2 suspend/resume cycles and one single shutdown.
It doesn't show what happened after the shutdown, i.e. the next boot where /boot and /boot/efi are supposed to not be mounted.
Oops 8-<
So I would suggest to reproduce the issue one more time but with debug logs enabled only when rebooting the system with /boot and /boot/efi not mounted.
Ok. I'll do it after the next zypper dup.
Also please make sure to answer the question in comment #28.
Wasn't the answer from comment #28 sufficient? In the meantime I checked again - and yes, both filesystems are mounted after even a few hibernate/resume cycles.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c34
--- Comment #34 from Michael Hirmke
Also please make sure to answer the question in comment #28.
Wasn't the answer from comment #28 sufficient?
#29 of course.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c35
--- Comment #35 from Michael Hirmke
(In reply to Michael Hirmke from comment #30)
At least there is no message that root has been unmounted correctly. Instead you can see in the log that the last fuser output shows lots of processes blocking the root filesystem. Instead you can see messages like
systemd-cryptsetup@root.service: Unit entered failed state.
Remember: root is encrypted!
Sorry but I don't see your point.
I'm not sure how to express it in a better way 8-< You said the messages are harmless, but an encrypted filesystem could be damaged, as far as I know, if not properly unmounted. And obviously it is not unmounted correctly - according to the message.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c36
--- Comment #36 from Franck Bui
You said, the messages are harmless, but an encrypted filesystem could be damaged, as far as I know, if not properly unmounted.
As I wrote it should be harmless *because* dracut is supposed to unmount it at the end even if systemd failed to do so before.
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c37
--- Comment #37 from Michael Hirmke
(In reply to Michael Hirmke from comment #35)
You said, the messages are harmless, but an encrypted filesystem could be damaged, as far as I know, if not properly unmounted.
As I wrote it should be harmless *because* dracut is supposed to unmount it at the end even if systemd failed to do so before.
Got it - thx!
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c38
--- Comment #38 from Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c39
Franck Bui
[ 18.786855] systemd[1]: Switching root.
[ 19.188536] systemd[1]: dev-nvme0n1p3.device: Changed dead -> plugged
[ 19.300921] systemd[1]: boot.mount: About to execute: /usr/bin/mount /dev/disk/by-uuid/bf69d826-e322-4c61-b071-a57180042bca /boot -t ext4
[ 19.415483] systemd[1]: Reloading.
[ 19.728841] systemd[1]: dev-nvme0n1p3.device: Changed dead -> plugged
[ 19.729160] systemd[1]: boot.mount: Changed dead -> mounted
[ 19.729896] systemd[1]: dev-nvme0n1p3.device: Changed plugged -> dead
[ 19.773765] systemd[1]: boot.mount: About to execute: /usr/bin/umount /boot -c

Unfortunately this one is pretty nasty and is still not addressed. It has been reported to us several times already; see https://bugzilla.suse.com/show_bug.cgi?id=1137373 for the original report. It is tracked upstream here: https://github.com/systemd/systemd/issues/12953. I'm not really sure why you're only facing it after doing an upgrade, but since this issue is a race I guess this is possible. So I'm going to close your bug report as a duplicate of bsc#1137373.

*** This bug has been marked as a duplicate of bug 1137373 ***
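To spot the same sequence in one's own debug log, a rough sketch along these lines may help. It is only a heuristic written for this thread, not a tool from systemd or openSUSE: it flags a umount that is started while the most recent state change of a .device unit was "plugged -> dead", which is the pattern visible in the excerpt above. It does not check that the device actually backs the mount, and the function name is made up.

```python
import re

# "<unit>: Changed <old> -> <new>" lines as logged by systemd at debug level
STATE_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+): Changed (?P<old>\S+) -> (?P<new>\S+)")
# "<unit>.mount: About to execute: /usr/bin/umount <where> ..." lines
UMOUNT_RE = re.compile(
    r"systemd\[1\]: (?P<unit>\S+\.mount): About to execute: /usr/bin/umount (?P<where>\S+)")


def find_suspicious_unmounts(lines):
    """Return (device_unit, mount_unit) pairs where a umount is started
    after a device unit went from plugged to dead."""
    hits = []
    last_dead_device = None
    for line in lines:
        m = STATE_RE.search(line)
        if m and m["unit"].endswith(".device"):
            if m["old"] == "plugged" and m["new"] == "dead":
                last_dead_device = m["unit"]
            elif m["new"] == "plugged":
                last_dead_device = None  # device is back, reset
            continue
        m = UMOUNT_RE.search(line)
        if m and last_dead_device is not None:
            hits.append((last_dead_device, m["unit"]))
    return hits
```

Fed with the eight log lines above, it reports the pair dev-nvme0n1p3.device / boot.mount.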
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c40
Michael Hirmke
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170
http://bugzilla.opensuse.org/show_bug.cgi?id=1155170#c41
Michael Hirmke
I found out that I don't even have to install something. It is enough to run zypper dup. Whenever the repos are updated, the problem occurs - in every single case. Running zypper dup when the repos didn't change doesn't trigger the problem. So how can it be a race condition when it happens every time? It is completely reproducible 8-<
Uh, I see - Lennart was postponing the fix again. Ok, I'll wait.

*** This bug has been marked as a duplicate of bug 1137373 ***