[Bug 912170] New: Boot fails with BTRFS RAID1 array as /home - open ctree failed
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

Bug ID: 912170
Summary: Boot fails with BTRFS RAID1 array as /home - open ctree failed
Classification: openSUSE
Product: openSUSE Distribution
Version: 13.2
Hardware: 64bit
OS: openSUSE 13.2
Status: NEW
Severity: Major
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: RyanSKingsbury@alumni.utexas.net
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

I recently did a clean install of OpenSUSE 13.2 into a system with three drives - 1 SSD containing the / partition (btrfs format with default subvolumes), and 2 identical HDD's containing /home in an encrypted BTRFS RAID1 array.

Initially, I was using only one of the HDD's as /home (still encrypted BTRFS). Once I added the second drive and started the RAID1 mirroring, my system will no longer boot without manual intervention. I get dropped into "emergency mode", but then all I have to do is exit (Ctrl+D) to continue booting and everything works normally.

The last entry in the system log before Emergency Mode is invoked is:

Code:
BTRFS: open_ctree failed

It appears that a similar (maybe the same) bug was reported on ArchWiki:

https://wiki.archlinux.org/index.php/Btrfs#BTRFS:_open_ctree_failed

But the solution refers to mkinitcpio rather than dracut that OpenSUSE uses, so I'm unsure how to apply it to my system.

Some relevant system information:

excerpt from /etc/fstab:
----
# encrypted RAID1 array containing /home
UUID=e91f611f-524a-43f5-bde5-8ebb9672f146 /home btrfs defaults 0 0
----

/etc/crypttab:
----
encrypted-home-sdb UUID=9c8fb7d0-74e2-4e38-b7c7-6211bbb6d2b1 none luks, retry=1
encrypted-home-sdc UUID=b82e5894-cea2-4fe7-a0a5-80f918e9db61 none luks, retry=1
----

excerpt from fdisk -l:
----
Device     Start        End    Sectors   Size Type
/dev/sdb1   2048 1953523711 1953521664 931.5G Microsoft basic data

Disk /dev/sdc: 931.5 GiB, 1000204886016 bytes, 1953525168 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
Disklabel type: gpt
Disk identifier: CAC38602-D5A5-4F28-8344-65B559124514

Device     Start        End    Sectors   Size Type
/dev/sdc1   2048 1953525134 1953523087 931.5G Linux filesystem

Disk /dev/mapper/encrypted-home-sdc: 931.5 GiB, 1000201723392 bytes, 1953518991 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes

Disk /dev/mapper/encrypted-home-sdb: 931.5 GiB, 1000200994816 bytes, 1953517568 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 4096 bytes
I/O size (minimum/optimal): 4096 bytes / 4096 bytes
----

Device UUID's:
----
/dev/sda1: UUID="04e20317-f554-43cf-a95d-b0385ebd22bb" UUID_SUB="cb9dc431-1855-466b-ae9a-5c0ddcd7f95a" TYPE="btrfs" PTTYPE="dos" PARTLABEL="primary" PARTUUID="860c637a-83c8-4399-8dc4-5b5ab66c0f06"
/dev/sdb1: UUID="9c8fb7d0-74e2-4e38-b7c7-6211bbb6d2b1" TYPE="crypto_LUKS" PARTLABEL="primary" PARTUUID="532a7bd2-6774-422e-90c0-31f19796098d"
/dev/sdc1: UUID="1ca9d3ba-c409-4127-91f5-e3d9c21242bd" TYPE="crypto_LUKS" PARTLABEL="Linux filesystem" PARTUUID="5a6830cc-9064-476c-aa79-6189dd0964bd"
/dev/mapper/encrypted-home-sdc: UUID="e91f611f-524a-43f5-bde5-8ebb9672f146" UUID_SUB="6007c4c7-5119-4b0f-ab21-134832c4774c" TYPE="btrfs"
/dev/mapper/encrypted-home-sdb: UUID="e91f611f-524a-43f5-bde5-8ebb9672f146" UUID_SUB="020bae26-b64f-413e-8263-a6e832ccb224" TYPE="btrfs"
----

--
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

Ryan Kingsbury <RyanSKingsbury@alumni.utexas.net> changed:

What      |Removed |Added
----------------------------------------------------------------------------
CC        |        |RyanSKingsbury@alumni.utexas.net
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

--- Comment #1 from Ryan Kingsbury <RyanSKingsbury@alumni.utexas.net> ---
It was suggested in the forums that I try:

echo 'c /dev/btrfs-control 0660 root root - 10 234' > /etc/tmpfiles.d/btrfs-control.conf

I issued this command as root, then rebooted, but the problem still occurred.
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

Andrei Borzenkov <arvidjaar@gmail.com> changed:

What      |Removed                                   |Added
----------------------------------------------------------------------------
CC        |                                          |arvidjaar@gmail.com, jeffm@suse.com
Component |Kernel                                    |Basesystem
Assignee  |kernel-maintainers@forge.provo.novell.com |systemd-maintainers@suse.de

--- Comment #2 from Andrei Borzenkov <arvidjaar@gmail.com> ---
Yes, I can reproduce it. Setup - two encrypted containers and btrfs on them as raid0 for both data and metadata.

The bug is in patch 1060-udev-use-device-mapper-target-name-for-btrfs-device-ready.patch. When the btrfs builtin runs, the /dev/mapper link is not yet created, so the builtin fails. This results in SYSTEMD_READY not being set, and systemd immediately attempts to mount the multi-device filesystem without the second device being available.

1392 open("/dev/btrfs-control", O_RDWR|O_CLOEXEC) = 7
1392 ioctl(7, BTRFS_IOC_DEVICES_READY, 0x7ffff80a3520) = -1 ENOENT (No such file or directory)

Cc'ing Jeff, who was the author of the patch. I am not sure what the statement

--><--
If the device is a DM device, udev will have already cached the table name from sysfs and we can use that to pass /dev/mapper/<name> to the builtin so that the correct name is used.
--><--

is based on - as far as I can tell, udev passes the argument verbatim, without doing any processing. Nor do I understand what problem was supposed to be fixed here.

@Ryan - as a workaround, copy /usr/lib/udev/rules.d/64-btrfs.rules into /etc/udev/rules.d/64-btrfs.rules and replace the following two lines

ENV{DM_NAME}=="", IMPORT{builtin}="btrfs ready $devnode"
ENV{DM_NAME}=="?*", IMPORT{builtin}="btrfs ready /dev/mapper/$env{DM_NAME}"

with this single line

IMPORT{builtin}="btrfs ready $devnode"

If you have *any* issues, please mention them here.
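The workaround from comment #2 can be sketched as a small shell function. This is only an illustration and was not posted in the bug: the function name patch_btrfs_rules is made up, and the sed patterns assume the two rules appear in the packaged file exactly as quoted in the comment, one per line with no extra whitespace. If the packaged file differs, the patterns will not match and the copy comes out unchanged.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): copy a udev rules file
# while applying the edit described in comment #2.
#   $1 = packaged rules file
#   $2 = destination for the override copy
patch_btrfs_rules() {
    src=$1
    dst=$2
    # First expression: delete the rule that only fires when DM_NAME is empty.
    # Second expression: turn the DM-specific rule into an unconditional
    # "btrfs ready $devnode". Single quotes keep $devnode and $env literal.
    sed -e '/^ENV{DM_NAME}=="", IMPORT{builtin}="btrfs ready \$devnode"$/d' \
        -e 's|^ENV{DM_NAME}=="?\*", IMPORT{builtin}="btrfs ready /dev/mapper/\$env{DM_NAME}"$|IMPORT{builtin}="btrfs ready $devnode"|' \
        "$src" > "$dst"
}
```

Run as root, the call would be: patch_btrfs_rules /usr/lib/udev/rules.d/64-btrfs.rules /etc/udev/rules.d/64-btrfs.rules. Comparing the two files with diff afterwards shows whether the patterns actually matched.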
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

--- Comment #3 from Ryan Kingsbury <RyanSKingsbury@alumni.utexas.net> ---
Andrei,
THANK YOU! Your fix worked. Will this workaround survive future updates to the kernel or udev?

I'll add one other observation that could be relevant. Before adding the second disk (when everything was working), the plymouth screen behind the prompt for the encryption password would load very quickly, fading from black to the teal background in 1-2 seconds. Once both disks were added to the array, the plymouth screen took almost 10 seconds to complete this transition. Could that indicate some kind of performance issue related to this? Note that the slow loading behavior persists even after I applied your fix.
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

--- Comment #4 from Andrei Borzenkov <arvidjaar@gmail.com> ---
(In reply to Ryan Kingsbury from comment #3)
> Andrei,
> THANK YOU! Your fix worked. Will this workaround survive future updates to
> the kernel or udev?

Yes, it will; but it makes sense to compare the updated file and merge back any other changes.

@systemd-maintainers: note that this causes failure to mount btrfs even *MANUALLY*. Because the patch effectively skips calling "btrfs ready" for every device-mapper device, when you run "mount /dev/xxx" the kernel knows about only the single device /dev/xxx and the mount fails. The user needs to either run "btrfs device scan" first or repeat the mount for every device that is part of the multi-device btrfs.
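The manual sequence described in comment #4 might look like the sketch below. It is an illustration only: the function name mount_multidev_btrfs and the RUN dry-run variable are invented here, and it assumes both LUKS containers are already unlocked. "btrfs device scan" is the btrfs-progs command that registers member devices with the kernel.

```shell
#!/bin/sh
# Hypothetical helper (not from the bug report): scan for btrfs member
# devices, then mount a multi-device filesystem through one of them.
#   $1 = any one member device, $2 = mount point
# Set RUN=echo to print the commands instead of executing them.
mount_multidev_btrfs() {
    dev=$1
    mnt=$2
    # Register every btrfs member device with the kernel first, so the
    # mount below does not see only the single device it was given.
    ${RUN:-} btrfs device scan || return 1
    ${RUN:-} mount "$dev" "$mnt"
}
```

With the reporter's layout, the call as root would be: mount_multidev_btrfs /dev/mapper/encrypted-home-sdb /home. Prefixing the call with RUN=echo gives a dry run that touches nothing.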
http://bugzilla.opensuse.org/show_bug.cgi?id=912170#c48

Franck Bui <fbui@suse.com> changed:

What      |Removed |Added
----------------------------------------------------------------------------
CC        |        |fbui@suse.com
Flags     |        |needinfo?(jeffm@suse.com)

--- Comment #48 from Franck Bui <fbui@suse.com> ---
Jeff, could you propagate the change to 13.1 as well so we can use the same systemd for both 13.1 and 13.2?
http://bugzilla.opensuse.org/show_bug.cgi?id=912170

Dr. Werner Fink <werner@suse.com> changed:

What      |Removed |Added
----------------------------------------------------------------------------
CC        |        |systemd-maintainers@suse.de
http://bugzilla.opensuse.org/show_bug.cgi?id=912170#c50

--- Comment #50 from Jeff Mahoney <jeffm@suse.com> ---
I've propagated the packaging changes to openSUSE 13.1, 13.2, and Leap 42.1 so that they're the same.
http://bugzilla.opensuse.org/show_bug.cgi?id=912170#c51

--- Comment #51 from Jeff Mahoney <jeffm@suse.com> ---
*** Bug 1000366 has been marked as a duplicate of this bug. ***
http://bugzilla.opensuse.org/show_bug.cgi?id=912170#c58

Andreas Stieger <astieger@suse.com> changed:

What      |Removed |Added
----------------------------------------------------------------------------
Status    |NEW     |IN_PROGRESS
CC        |        |astieger@suse.com

--- Comment #58 from Andreas Stieger <astieger@suse.com> ---
Incident for 13.2 is running. Packages will appear in the test repositories below. Please test.

http://download.opensuse.org/repositories/openSUSE:/Maintenance:/5806/
http://download.opensuse.org/update/13.2-test/