[Bug 1205261] New: dracut/hooks/emergency...ESP's FAT serial number in initrd halts boot in dracut emergency shell after rsync migration to new GPT system disk
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261

            Bug ID: 1205261
           Summary: dracut/hooks/emergency...ESP's FAT serial number in initrd halts boot in dracut emergency shell after rsync migration to new GPT system disk
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Basesystem
          Assignee: screening-team-bugs@suse.de
          Reporter: mrmazda@earthlink.net
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Created attachment 862770
  --> http://bugzilla.opensuse.org/attachment.cgi?id=862770&action=edit
rdsosreport.txt from TW boot attempt

Original Summary:
dracut/hooks/emergency...ESP's FAT serial number in initrd halts boot in dracut emergency shell after rsync migration to new GPT system disk

Initial state:
1-configured booting & mounting are via LABELs (making UUIDs administratively unimportant, and grub.cfg's own auto-generated stanzas uncommonly necessary or desired), with /etc/grub.d/06_custom causing /boot/grub2/custom.cfg's vmlinuz & initrd symlink stanzas to precede auto-generated entries at boot time. Example:
https://forums.opensuse.org/showthread.php/533087-How-to-have-a-custom-UEFI-...
2-in /etc/default/grub: GRUB_DISTRIBUTOR="opensusetw" # TW20221008 last zypper dup
3-multiboot of TW with Leap 15.1, 15.2, 15.3, 15.4, Debian 11 & 12, Ubuntu 20.04 & 22.04 on single NVME
4-only TW installation is configured to touch NVRAM or ESP for writing (e.g., other OSes' fstabs don't mount ESP, and/or they have no bootloader installed)

To reproduce:
1-GPT partition new NVME with ESP, swap, and / targets for TW, and / for at least one additional distro
2-format new NVME's matching targets ESP FAT32, swap swap, and / EXT4 for TW
3-"rsync -rlptgoDHAX --exclude 'lost+found'" from old NVME's ESP and / to new ESP and / filesystems for TW
4-appropriately edit volume LABELs on new NVME /boot/grub2/grub.cfg, /boot/grub2/custom.cfg, /etc/fstab
5-repeat create/format/rsync/edit for additional distro(s)
6-remove original NVME
7-try to boot from new NVME

Actual behavior:
1-all other distro(s) boot normally (via TW's custom.cfg entries) as if nothing had been changed
2-TW boot halts in dracut emergency shell because the original ESP's FAT serial number cannot be found (see attached rdsosreport.txt)

Expected behavior:
1-all distros boot normally (via custom.cfg entries) as if nothing had been changed

Notes:
1-Boot is normal since rebuilding of initrds post-migration.
2-I looked for ESP FAT serial references in several non-TW initrds and found none.
3-from lsinitrd of original initrd-5.19.13-default:
-rw-r--r-- 1 root root 92 Oct 9 23:10 usr/lib/dracut/hooks/emergency/80-\x2fdev\x2fdisk\x2fby-uuid\x2f20A0-1003.sh
4-# lsblk -f | grep vfat # (current state, not initial state)
├─nvme1n1p1 vfat FAT16 PI3P01ESP  20A0-1003              (*original* ESP on 120G NVME)
├─nvme0n1p1 vfat FAT32 PNY5P01ESP 4C58-8D7E 294.1M 8% /boot/efi (*new* ESP on 500G NVME)
5-This issue complicates restoring from backups.

--
You are receiving this mail because:
You are on the CC list for the bug.
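The hook file name in note 3 encodes the device path that the initrd waits for. The following is a minimal sketch, not part of dracut, that decodes such a name and checks whether the device exists on the running system. It assumes that "/" is written as \x2f in the name and that the leading "80-" is only a priority prefix; the function name decode_hook_name is made up for this illustration.

```shell
#!/bin/sh
# Decode a dracut emergency-hook file name into the device path it refers to.
# Assumption: "/" is escaped as the literal four characters \x2f in the name.
decode_hook_name() {
    # Drop the directory and the ".sh" suffix, strip the "NN-" priority
    # prefix, then turn each literal \x2f back into "/".
    basename "$1" .sh | sed -e 's/^[0-9]*-//' -e 's|\\x2f|/|g'
}

# File name as reported by lsinitrd in the bug description.
hook='usr/lib/dracut/hooks/emergency/80-\x2fdev\x2fdisk\x2fby-uuid\x2f20A0-1003.sh'
dev=$(decode_hook_name "$hook")
echo "initrd expects: $dev"

# After a migration to a freshly formatted ESP the old serial is gone.
if [ -e "$dev" ]; then
    echo "device is present"
else
    echo "device is missing on this system"
fi
```

Running the same function over the hook names listed by lsinitrd for any initrd copied by rsync gives a quick list of devices to verify before the first boot from the new disk.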
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c1
--- Comment #1 from Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c3
--- Comment #3 from Felix Miata
> I recommend you create a new conf snippet with a higher number, because this will be overwritten after a dracut update.

I haven't touched it. Wouldn't changing it from "by-uuid" to "by-label" just change the usr/lib/dracut/hooks/emergency failure from a wrong ID to a wrong label? Is this recommendation just an aside for future use?

I've now created 13-persistent-local.conf containing persistent_policy="by-label", but I have at least 40 more TW installations on other multiboot PCs subject to disk upgrades and restoring from backup. For the / filesystem there is the Grub linux line option root= to override the initrd, but what is the equivalent for whatever usr/lib/dracut/hooks/emergency is there for? Once a Grub menu selection has been made, there's no need for anything to read the ESP again before init completes (absent any encrypted filesystems), is there?

What man page covers usr/lib/dracut/hooks/emergency? man /etc/dracut.conf.d/10-persistent_policy.conf is unhelpful.
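For repeating the snippet creation on many installations, a small helper along these lines might do. It is only a sketch: the function name write_persistent_policy is hypothetical, the file name and setting are the ones quoted in the comment above, and whether "by-label" avoids the emergency hook failure is exactly the open question raised there.

```shell
#!/bin/sh
# Write a dracut conf snippet that sets the persistent device naming policy.
# Arguments: $1 = conf directory, $2 = policy (defaults to by-label).
# The function name is hypothetical; the file name comes from the comment.
write_persistent_policy() {
    conf_dir=$1
    policy=${2:-by-label}
    printf 'persistent_policy="%s"\n' "$policy" \
        > "$conf_dir/13-persistent-local.conf"
}
```

As root this would be called as write_persistent_policy /etc/dracut.conf.d. The setting only takes effect for initrds built afterwards, for example with dracut --force --regenerate-all; images already on disk keep whatever was compiled into them.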
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c4
--- Comment #4 from Felix Miata
> What man page covers usr/lib/dracut/hooks/emergency? man /etc/dracut.conf.d/10-persistent_policy.conf is unhelpful.

Many many hours' work documenting and reporting here, and I just now found man dracut.cmdline. With rd.hostonly=0 the original initrds are usable. :p
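Since rd.hostonly=0 is typed on the linux line at boot, it is easy to forget that the running system came up with it. A minimal sketch, assuming a Linux system where /proc/cmdline is readable, that reports whether the current boot used the option; the helper name cmdline_has_arg is made up for this illustration.

```shell
#!/bin/sh
# Return success if the given kernel command line contains the exact argument.
# Arguments: $1 = argument to look for, $2 = command line text.
cmdline_has_arg() {
    # Unquoted $2 is split on whitespace into individual arguments.
    for word in $2; do
        [ "$word" = "$1" ] && return 0
    done
    return 1
}

# /proc/cmdline holds the command line the running kernel was booted with.
if [ -r /proc/cmdline ] && cmdline_has_arg rd.hostonly=0 "$(cat /proc/cmdline)"; then
    echo "booted with rd.hostonly=0"
else
    echo "rd.hostonly=0 is not on the current command line"
fi
```

If the option is present, the initrd on disk is still the one built for the old disk layout and would need rebuilding before booting without the option.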
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c7
--- Comment #7 from Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c9
Felix Miata
> If you add this option in your first boot after restoring the backup and then regenerate the initrd from the new running system, you should not need it anymore.

As you should not need root= anymore, right? :)
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c11
--- Comment #11 from Felix Miata
> ToDo: rebuild the RAID TW's initrds and retest.

I tried rebuilding initrds for the RAID installation several times and couldn't get the / filesystem to be found until I chrooted into it and dup'd from 20221008 to 20221205, which rebuilt the latest kernel at least 4 times. First and subsequent boots worked with the last one built, whether starting from the ESP on NVME or from the original MBR/EXT2 /boot/ filesystem on SATA.

Only the first boot failure landed me in a dracut shell. All boots after I began rebuilding initrds simply hung, with an unlimited timeout, failing to find / on /dev/md3. The cause turned out to be that dracut was excluding 2/3 of the RAID lines included in a working initrd.
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c12
--- Comment #12 from Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c13
--- Comment #13 from Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c14
--- Comment #14 from Felix Miata
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261
http://bugzilla.opensuse.org/show_bug.cgi?id=1205261#c15
Felix Miata