Bug ID | 916014 |
---|---|
Summary | EXT4 on LVM on LUKS on RAID5 doesn't ask for password on boot |
Classification | openSUSE |
Product | openSUSE Distribution |
Version | 13.2 |
Hardware | x86-64 |
OS | openSUSE 13.2 |
Status | NEW |
Severity | Critical |
Priority | P5 - None |
Component | Bootloader |
Assignee | jsrain@suse.com |
Reporter | darren@freemaninstruments.com |
QA Contact | jsrain@suse.com |
Found By | --- |
Blocker | --- |
I have been using LVM on LUKS for a while now. It's great while it's working, but it almost always requires some fiddling to get a freshly installed system to boot. Fiddling often entails manually bringing the system far enough up that it can pull in updated packages. For some reason, openSUSE almost always ships broken in this regard, creating a lot of extra work.

My laptop has been upgraded from 12.x up to 13.1 and then 13.2. I've managed to get it running fine. But now I'm installing a new file server and I've absolutely had it with trying to get openSUSE to boot. Compared to the laptop, this one adds a RAID5 under the LUKS, although I have an older server running 12.2 in this configuration just fine.

Here is a summary of what I've been through over the last couple of days.

Six discs, MBR partitioning, each with a 256 MB partition 1 and the rest of the space as partition 2. The first partitions are all RAID1 together as /boot; the second partitions are all RAID5 together to hold the LUKS/LVM. Grub2 is installed into the /dev/sda MBR. I intend to install it on all of the drives once it's working. Temporarily, drive 6 is actually external via USB until another drive is sourced. Drive 6 doesn't have the /boot first partition on it.

* The 13.2 installer asks to unlock LUKS and correctly does so, but then doesn't discover the logical volumes inside. Hit "rescan devices", it asks for the password again and then does find the logical volumes. Installation proceeds as normal.
* 13.2 won't boot. It says it is waiting on a certain UUID to appear, which I guess is the logical volume for root. It will never appear. Some clever individual has decided that there should be no time-out, so it waits forever and you don't get a shell.
* Manually added rd.timeout=60 and rd.shell to the kernel command line. Since we are now using grub2, every change to the command line takes a fair bit of effort from the rescue system.
* Now Dracut drops to a shell. But even though I can easily access the logical volume and create a symlink /dev/root that points to /dev/system/root, when I exit the shell it always says "transaction is terminal" or similar. Other systems running Dracut (Arch Linux) actually boot at this point, but this one won't. Dracut is an abomination, by the way. How does it work? Nobody knows.
* I have tried passing luks=1 luks.uuid= luks.name= to the kernel, with no improvement. After much hacking I got it to the point where software RAID was no longer getting included in the initramfs. So I tried reinstalling a second time, this time without including the update repo during install, and including the timeout/shell options in the bootloader installation.
* Still wouldn't work. Despite being able to chroot to the installed system (from rescue) and run YaST2 for package management and bootloader installation, I just can't get it to either ask for a password to unlock LUKS on boot, or actually do anything further once provided with the /dev/root symlink.
* It is definitely including MD RAID, LVM and LUKS in the initramfs. From the shell I can set everything up. It's just a matter of issuing the right commands. I note that the related scripts all have the same priority of 90, and I wonder if that's why it can't figure out that they have to run one at a time and in the correct order.
* I tried installing 13.1, thinking that maybe the upgrade path is why my laptop works. The installer doesn't have the bug where the first time you unlock LUKS, it doesn't find the logical volumes. It finds them first go. Installation is smooth.
* 13.1 doesn't boot. This time, I get the previous init system, which is much easier to work with. It loops 255 times over several minutes hoping to discover the root, before asking for the root password two or three times. (Clever - there is no /etc/passwd) Then it gave me a shell anyway. Finally! I thought. I can fix this.
* Couldn't run cryptsetup luksOpen - the root filesystem is read-only in the 13.1 rescue environment. So cryptsetup can't write a file it needs and won't perform the action.

My God. openSUSE used to work most of the time, and be quite easily fixed. This is ridiculous. The 13.x series appears to be incredibly fragile and inadequately tested. I would expect this to be easily reproducible in a VM; it doesn't seem to depend on my hardware.

Tomorrow I will try using the 13.2 rescue system to see if I can repair 13.1. Then I might install 13.2 and try booting with the USB drive failed from the array and disconnected, in case USB is delaying assembly of the array until it's too late to unlock it. But my other server is actually running a RAID4 with the parity drive on USB. So at some point I made this work on 12.2. I remember it being a lot of effort, but that's always been the case. In my experience, openSUSE just never ever boots first go on LVM/LUKS. It's still an OS that's worth fixing after install, assuming it can be done.
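
For anyone trying to reproduce this, the layering in this report (RAID5 at the bottom, LUKS on top of it, LVM inside LUKS, then the root filesystem) implies a strict bring-up order. The Python sketch below only prints one plausible command sequence in that order. The device name /dev/md1 and the mapping name cr_md1 are hypothetical placeholders; only the volume group name "system" is taken from the /dev/system/root path mentioned in this report. It is an illustration of the ordering, not the exact commands run on the affected machine.

```python
# Sketch only: lists the commands for bringing up EXT4 on LVM on LUKS on RAID5
# in dependency order. Nothing is executed; the commands are just printed.
# ASSUMPTION: /dev/md1 and cr_md1 are hypothetical names, not from the report.


def unlock_commands(md_device="/dev/md1", luks_name="cr_md1", vg_name="system"):
    """Return the shell commands as strings, bottom storage layer first."""
    return [
        # 1. Assemble the MD RAID arrays so the RAID5 device exists.
        "mdadm --assemble --scan",
        # 2. Unlock the LUKS container that sits on the RAID5 array.
        f"cryptsetup luksOpen {md_device} {luks_name}",
        # 3. Activate the logical volumes inside the unlocked container.
        f"vgchange -ay {vg_name}",
        # 4. Point /dev/root at the root logical volume, as tried in the report.
        f"ln -s /dev/{vg_name}/root /dev/root",
    ]


if __name__ == "__main__":
    for command in unlock_commands():
        print(command)
```

Each step depends on the device created by the step before it, which is why scripts that all run at the same priority, with nothing enforcing the order, could plausibly fail to find the root volume.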