[Bug 1222429] New: btrfs does not know norecovery since 6.8
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Bug ID: 1222429 Summary: btrfs does not know norecovery since 6.8 Classification: openSUSE Product: openSUSE Tumbleweed Version: Current Hardware: Other OS: openSUSE Tumbleweed Status: NEW Severity: Normal Priority: P5 - None Component: YaST2 Assignee: yast2-maintainers@suse.de Reporter: jslaby@suse.com QA Contact: jsrain@suse.com Target Milestone: --- Found By: --- Blocker: --- This is a clone of: https://github.com/yast/yast-update/issues/197 due to no responses in there. In yast-update/src/modules/RootPart.rb, Line 41 in b3e78e0 there is: NORECOVERY_FS = [:btrfs, :ext3, :ext4, :xfs].freeze norecovery was deprecated in torvalds/linux@74ef001 (5.9) and finally removed in torvalds/linux@a1912f7 (6.8). Due to that, btrfs cannot be probed by yast now: btrfs: Unknown parameter norecovery It used to warn between 5.9 and 6.8: 'nologreplay' is deprecated, use 'rescue=nologreplay' instead Yast should adopt, but I don't know how to do it in a generic way... -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Found By|--- |openQA See Also| |https://github.com/yast/yas | |t-update/issues/197 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c1 Lukas Ocilka <locilka@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |jslaby@suse.com Flags| |needinfo?(jslaby@suse.com) Found By|openQA |--- --- Comment #1 from Lukas Ocilka <locilka@suse.com> --- Jiri, please, add a little bit more about your use-case. E.g., how exactly did you get to the state of seeing hitting error, e.g., how to reproduce it. Plus YaST logs would be nice too. See more at https://en.opensuse.org/openSUSE:How_to_Write_a_Good_Bugreport -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Lukas Ocilka <locilka@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Found By|--- |openQA -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c2 Ancor Gonzalez Sosa <ancor@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |ancor@suse.com --- Comment #2 from Ancor Gonzalez Sosa <ancor@suse.com> --- (In reply to Jiri Slaby from comment #0)
Yast should adopt, but I don't know how to do it in a generic way...
I fear there is no "generic way". That RootPart.rb file implements almost all the logic for system upgrades in a one-script-to-rule-everything way, not relying too much in the rest of the YaST infrastructure. So we probably just need to adjust that list pointed in the report. I don't think there is another usage of norecovery in any other part of YaST. For context, the usage of "norecovery" during upgrade was introduced here: https://github.com/yast/yast-update/pull/180 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c3 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |dsterba@suse.com, | |fdmanana@suse.com, | |wqu@suse.com Flags|needinfo?(jslaby@suse.com) | --- Comment #3 from Jiri Slaby <jslaby@suse.com> --- (In reply to Ancor Gonzalez Sosa from comment #2)
So we probably just need to adjust that list pointed in the report. I don't think there is another usage of norecovery in any other part of YaST.
Yeah, but I had older distros in mind. I am not sure when that 'rescue=nologreplay' was introduced. So if you put yast with this change into SLES, it might not be working. (I can check.) -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c4 --- Comment #4 from Jiri Slaby <jslaby@suse.com> --- So rescue=nologreplay was introduced in commit 74ef00185eb864252156022ff129b01549504175 Author: Qu Wenruo <wqu@suse.com> Date: Thu Jun 4 15:18:06 2020 +0800 btrfs: introduce "rescue=" mount option in 5.9, so at least 15-sp5+ all contain that. I have no idea what yast workflow to take patches/new version into SLE is though. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c5 Wenruo Qu <wqu@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags| |needinfo?(jslaby@suse.com) --- Comment #5 from Wenruo Qu <wqu@suse.com> --- Personally speaking, if you're going to use "rescue=" option, your fs is already screwed up on most cases. Furthermore, most rescue= options would require ro mount option, and can not be remounted to RW, this is by design. So I do not think we need "norecovery" mount option for btrfs at all. Any more explanation on the "norecovery" mount option usage? Maybe I'm missing something critical. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c6 Jiri Slaby <jslaby@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Flags|needinfo?(jslaby@suse.com) | --- Comment #6 from Jiri Slaby <jslaby@suse.com> --- (In reply to Wenruo Qu from comment #5)
Personally speaking, if you're going to use "rescue=" option, your fs is already screwed up on most cases.
Furthermore, most rescue= options would require ro mount option, and can not be remounted to RW, this is by design.
So I do not think we need "norecovery" mount option for btrfs at all.
I have no idea :). Talk to yast people in here...
Any more explanation on the "norecovery" mount option usage? Maybe I'm missing something critical.
So see also: (In reply to Ancor Gonzalez Sosa from comment #2)
For context, the usage of "norecovery" during upgrade was introduced here: https://github.com/yast/yast-update/pull/180 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c7 --- Comment #7 from Jiri Slaby <jslaby@suse.com> --- (In reply to Wenruo Qu from comment #5)
Furthermore, most rescue= options would require ro mount option, and can not be remounted to RW, this is by design.
Note that exactly happens: mount_options = ["ro"] mount_options << "norecovery" if NORECOVERY_FS.include?(freshman[:fs]) -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c8 --- Comment #8 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #6)
Any more explanation on the "norecovery" mount option usage? Maybe I'm missing something critical.
So see also: (In reply to Ancor Gonzalez Sosa from comment #2)
For context, the usage of "norecovery" during upgrade was introduced here: https://github.com/yast/yast-update/pull/180
Which apparently resulted from bug 1195894. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c9 --- Comment #9 from Wenruo Qu <wqu@suse.com> --- Thanks for the info on Bug 1195894. It really explains more than what I need. So "norecovery" is to prevent log/journal replay, and btrfs' equivalent is "rescue=nologreplay". Personally speaking, let the installer just do a "ro" mount would be more than enough. Unless the fs is already corrupted and log/journal replay could fail. And even for a log replay failure case, unless it's something only fixed in newer kernels, it doesn't make much difference. That's why most fses replay their logs even in RO mount, because that's needed to recover the not yet committed data. And I do not think who is replaying the log (either the installer, or the kernel of that partition on next boot) does any difference. As a quick fix, just moving btrfs out of "NORECOVERY_FS" would be sufficient to me. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c10 --- Comment #10 from Jiri Slaby <jslaby@suse.com> --- (In reply to Wenruo Qu from comment #9)
As a quick fix, just moving btrfs out of "NORECOVERY_FS" would be sufficient to me.
But that would replay the log on a mounted filesystem which would brick it for the mounter, right? Or what am I missing? (I assume/understand that ro replays the log.) AIUI, there is a cluster, someone has mounted a FS on it. There there is this installer (yast) which wants to probe all filesystems on the cluster, but not actually touch/write them, because they are mounted elsewhere. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c11 --- Comment #11 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #10)
(In reply to Wenruo Qu from comment #9)
As a quick fix, just moving btrfs out of "NORECOVERY_FS" would be sufficient to me.
But that would replay the log on a mounted filesystem which would brick it for the mounter, right? Or what am I missing? (I assume/understand that ro replays the log.)
AIUI, there is a cluster, someone has mounted a FS on it. There there is
*Then there I hate bugzilla has on "Edit" button.
this installer (yast) which wants to probe all filesystems on the cluster, but not actually touch/write them, because they are mounted elsewhere. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c12 --- Comment #12 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #11)
(In reply to Jiri Slaby from comment #10)
(In reply to Wenruo Qu from comment #9)
As a quick fix, just moving btrfs out of "NORECOVERY_FS" would be sufficient to me.
But that would replay the log on a mounted filesystem which would brick it for the mounter, right? Or what am I missing? (I assume/understand that ro replays the log.)
AIUI, there is a cluster, someone has mounted a FS on it. There there is
*Then there
I hate bugzilla has on "Edit" button.
F*cking bugzilla, no "Edit" button.
this installer (yast) which wants to probe all filesystems on the cluster, but not actually touch/write them, because they are mounted elsewhere. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c13 --- Comment #13 from Wenruo Qu <wqu@suse.com> --- (In reply to Jiri Slaby from comment #10)
(In reply to Wenruo Qu from comment #9)
As a quick fix, just moving btrfs out of "NORECOVERY_FS" would be sufficient to me.
But that would replay the log on a mounted filesystem which would brick it for the mounter, right? Or what am I missing? (I assume/understand that ro replays the log.)
AIUI, there is a cluster, someone has mounted a FS on it. There there is this installer (yast) which wants to probe all filesystems on the cluster, but not actually touch/write them, because they are mounted elsewhere.
Oh, I got the point, it's SAN, so there is nothing preventing double mounting from different clients. In that case it's true that we can not do any write, including log replay. Then we need to maintain a per-fs mount options to prevent log replaying. We choose not to adapt "norecovery" name, because it takes me quite several minutes to remember it's no log replay... Other than my initial idea to disable rescue mount options. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c14 --- Comment #14 from Lukas Ocilka <locilka@suse.com> --- (In reply to Jiri Slaby from comment #10)
AIUI, there is a cluster, someone has mounted a FS on it. There there is this installer (yast) which wants to probe all filesystems on the cluster, but not actually touch/write them, because they are mounted elsewhere.
Partially correct: YaST (Installer) needs to mount all filesystems to see what is in them to provide options for upgrade (e.g., all compatible products to upgrade "from"). But these filesystems are NOT mounted anywhere else. The point of r-o mounting is to keep the FS "as is" (also/or ideally with noatime as that may also change something). And THEN, when a system to upgrade from is chosen, then this one and all related filesystems are mounted r-w as the upgrade process is free to change them (in fact "intended" to change them). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c15 --- Comment #15 from Stefan Hundhammer <shundhammer@suse.com> --- The way we do that in the installer is we are probing all attached disks for existing filesystems, for installation as well as for a system upgrade. We need to do this in a non-destructive way (because we might detect a filesystem that belongs to another system, for example for parallel boot ), and also in a way that does not get the whole installation process stuck for a long time. In particular Btrfs was notorious for starting a long-overdue rebalancing at that time if a user was unlucky, and you know how long that can take. For a new installation, we use that information to calculate a proposed storage layout (what filesystems to reuse, which ones to delete, which ones not to touch at all). For an upgrade, we need to check which filesystems might be eligible for an upgrade; basically, which ones contain an existing older SUSE distro that we can offer as candidates for an upgrade. In both cases, we need to mount the filesystem to check for files such as etc/fstab, etc/crypttab, etc/os-release. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Stefan Hundhammer <shundhammer@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |aschnell@suse.com -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c16 --- Comment #16 from David Sterba <dsterba@suse.com> --- (In reply to Lukas Ocilka from comment #14)
The point of r-o mounting is to keep the FS "as is" (also/or ideally with noatime as that may also change something).
Please note that the 'ro' mount option semantics does not prevent writes to happen before the mount is finished, the writes are prevented from user space. IOW, log replay can still happen with 'ro' mount so you need some sort of log replay preventing option, or to temporarily make the block device read-only (blockdev --setro). -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c17 --- Comment #17 from Stefan Hundhammer <shundhammer@suse.com> --- (In reply to David Sterba from comment #16)
Please note that the 'ro' mount option semantics does not prevent writes to happen before the mount is finished, the writes are prevented from user space. IOW, log replay can still happen with 'ro' mount so you need some sort of log replay preventing option, or to temporarily make the block device read-only (blockdev --setro).
That's a good hint. I assume this affects only the currently running kernel and processes, and nothing is written to the device, correct? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Stefan Hundhammer <shundhammer@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Summary|btrfs does not know |YaST uses dropped btrfs |norecovery since 6.8 |option 'norecovery' (no | |longer available since | |btrfs 6.8) -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 Stefan Hundhammer <shundhammer@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Assignee|yast2-maintainers@suse.de |yast-internal@suse.de Status|NEW |CONFIRMED URL| |https://trello.com/c/VVxT2V | |O4 -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c18 --- Comment #18 from Stefan Hundhammer <shundhammer@suse.com> --- Moving to our Trello task queue. This probably deserves a higher priority. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c19 --- Comment #19 from Stefan Hundhammer <shundhammer@suse.com> --- AFAICS this is not used in libstorage-ng. Arvin, please check. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c20 --- Comment #20 from Arvin Schnell <aschnell@suse.com> --- (In reply to Stefan Hundhammer from comment #19)
AFAICS this is not used in libstorage-ng. Arvin, please check.
No, libstorage-ng does not use 'norecovery'. Neither does it use 'blockdev --setro'. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c21 --- Comment #21 from David Sterba <dsterba@suse.com> --- (In reply to Stefan Hundhammer from comment #17)
That's a good hint. I assume this affects only the currently running kernel and processes, and nothing is written to the device, correct?
I'm not sure I understand the question. Mount is done by currently running kernel, the log replay can do changes to the underlying device. For example there's pending change to file permissions in the log, this is found and the inode updated on device (transaction commit), then mount finishes and the filesystem does not allow any writes. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c22 --- Comment #22 from Jiri Slaby <jslaby@suse.com> --- Any updates on this? Can I drop the revert from the kernel, i.e. restore the upstream behavior of NOT supporting norecovery in btrfs? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c23 --- Comment #23 from Ancor Gonzalez Sosa <ancor@suse.com> --- (In reply to Jiri Slaby from comment #22)
Any updates on this?
My fault. I forgot to put this into the previous development sprint of the YaST Team. I will try to get it moving. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c24 --- Comment #24 from Ancor Gonzalez Sosa <ancor@suse.com> --- BTW, just out of curiosity. How this was detected? I'm asking because this only affects the so-called "offline update", which consists on upgrading a distro to its next version by booting into the installation image and selecting "upgrade" instead of "install". And I don't think that's something anyone would do for Tumbleweed, it's completely contrary to the concept of a rolling distribution. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c25 --- Comment #25 from Jiri Slaby <jslaby@suse.com> --- (In reply to Ancor Gonzalez Sosa from comment #24)
BTW, just out of curiosity. How this was detected?
I'm asking because this only affects the so-called "offline update", which consists on upgrading a distro to its next version by booting into the installation image and selecting "upgrade" instead of "install". And I don't think that's something anyone would do for Tumbleweed, it's completely contrary to the concept of a rolling distribution.
A test in openQA popped this out. So I assume this is explicitly tested, so supported. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c26 --- Comment #26 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #25)
A test in openQA popped this out. So I assume this is explicitly tested, so supported.
One apparently can upgrade from Leap to TW that way... -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c27 Daan De Meyer <daan.j.demeyer@gmail.com> changed: What |Removed |Added ---------------------------------------------------------------------------- CC| |daan.j.demeyer@gmail.com --- Comment #27 from Daan De Meyer <daan.j.demeyer@gmail.com> --- Note that by removing norecovery you've broken every single scenario where systemd tries to mount a btrfs image: https://github.com/systemd/systemd/pull/32892. What happened to the number one rule of kernel development: "Don't break userspace"? -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c28 --- Comment #28 from David Sterba <dsterba@suse.com> --- Nothing happend, it stands and everybody is perfectly aware of that, thanks for asking. Any change that could affect userspace is first announced, deprecation period set and if nobody responds then the option in this case is removed, with a substitute for the functionality. The deprectaion was announced in 5.11, since commit fed8f166ebf3afb a warning is printed in system log. We can't do more, if users or application developers ignore or do not notice such things it's not breaking userspace from us. Deprecation period lasted 3 years (5.11 to 6.8), we can of course make it longer and we do that for other features. You can simply ask for extending it if eg. systemd wants to guarantee some sort of LTS support, but otherwise we remove the things as announced so we don't drag too much deprecated code along. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c29 --- Comment #29 from Arvin Schnell <aschnell@suse.com> --- FYI: Lennart Poettering just complained about the option being removed: https://www.spinics.net/lists/linux-btrfs/msg146338.html -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c30 --- Comment #30 from Jiri Slaby <jslaby@suse.com> --- So this is being fixed in the kernel after all: https://lore.kernel.org/all/44c367eab0f3fbac9567f40da7b274f2125346f3.1716285... I think we should switch back to Kernel component... I took the above patch instead of the revert into the kernel. -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c31 --- Comment #31 from Jiri Slaby <jslaby@suse.com> --- (In reply to Jiri Slaby from comment #30)
I took the above patch instead of the revert into the kernel.
(master+stable branches) -- You are receiving this mail because: You are on the CC list for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1222429 https://bugzilla.suse.com/show_bug.cgi?id=1222429#c32 Stefan Hundhammer <shundhammer@suse.com> changed: What |Removed |Added ---------------------------------------------------------------------------- Status|CONFIRMED |RESOLVED Resolution|--- |FIXED --- Comment #32 from Stefan Hundhammer <shundhammer@suse.com> --- Closing according to comment #30. -- You are receiving this mail because: You are on the CC list for the bug.
participants (1)
-
bugzilla_noreply@suse.com