[Bug 1213222] New: Better warning and error recovery when dracut fails during installation
https://bugzilla.suse.com/show_bug.cgi?id=1213222

Bug ID: 1213222
Summary: Better warning and error recovery when dracut fails during installation
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: openSUSE Tumbleweed
Status: NEW
Severity: Enhancement
Priority: P5 - None
Component: YaST2
Assignee: yast2-maintainers@suse.de
Reporter: akruppa@gmail.com
QA Contact: jsrain@suse.com
Target Milestone: ---
Found By: ---
Blocker: ---

This is a copy-paste of https://forums.opensuse.org/t/boot-problem-after-15-5-tumbleweed-upgrade/167...

-----

After upgrading from Leap 15.5 to Tumbleweed, my system failed to boot with a message:

  No filesystem could mount root, tried:
  Kernel panic - not syncing: VFS: Unable to mount root fs on unknown-block(8,2)

I think I've narrowed the problem down to dracut failing during the upgrade process. The relevant lines of output are:

  dracut-install: Failed to find module 'atiixp'
  dracut[E]: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.X8M8eJ/initramfs -N ^i2o_scsi$ --kerneldir /lib/modules/6.3.7-1-default/ -m pata_atiixp ata_generic fan atiixp ide_pci_generic jbd ext3 edd
  dracut[F]: installkernel failed in module suse-initrd

The file /etc/sysconfig/kernel contains the line:

  INITRD_MODULES="pata_atiixp ata_generic processor fan ahci atiixp ide_pci_generic jbd ext3 edd"

-----

There were stale entries in INITRD_MODULES which caused dracut to fail.

My enhancement suggestion is:

1. If dracut fails during installation, state clearly that this leaves the system in an unbootable state. This would help users who are not familiar with the Linux boot process to understand what the problem is.

2. If dracut fails because of missing modules during installation, allow the user to edit the INITRD_MODULES setting and try dracut again.

--
You are receiving this mail because:
You are on the CC list for the bug.
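The stale-entry situation described in this report could in principle be checked for before an upgrade. The following is only a rough sketch in Python, not something YaST or dracut does. The helper names `read_initrd_modules`, `available_modules` and `stale_entries` are made up for illustration, and the sketch assumes that a module counts as available when a `*.ko` file (possibly compressed) with that name exists somewhere below the kernel directory. Built-in modules and module aliases are not handled.

```python
import shlex
from pathlib import Path


def read_initrd_modules(sysconfig_text):
    """Return the module names from an INITRD_MODULES="..." line, or []."""
    for line in sysconfig_text.splitlines():
        line = line.strip()
        if line.startswith("INITRD_MODULES="):
            value = line.split("=", 1)[1]
            # shlex removes the surrounding shell quotes.
            parts = shlex.split(value)
            return parts[0].split() if parts else []
    return []


def available_modules(kernel_dir):
    """Collect module names from *.ko, *.ko.xz, *.ko.zst ... files (assumption)."""
    names = set()
    for path in Path(kernel_dir).rglob("*.ko*"):
        name = path.name.split(".ko", 1)[0]
        # '-' and '_' are treated as equivalent in kernel module names.
        names.add(name.replace("-", "_"))
    return names


def stale_entries(modules, kernel_dir):
    """Return the listed modules for which no module file was found."""
    present = available_modules(kernel_dir)
    return [m for m in modules if m.replace("-", "_") not in present]
```

A possible use would be to read `/etc/sysconfig/kernel`, pass the result of `read_initrd_modules` together with a directory such as `/lib/modules/6.3.7-1-default` to `stale_entries`, and review whatever names come back before running the upgrade.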
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c1

--- Comment #1 from Stefan Hundhammer <shundhammer@suse.com> ---
> My enhancement suggestion is:
> 1. If dracut fails during installation, state clearly that this leaves the system in an unbootable state. This would help users who are not familiar with the Linux boot process to understand what the problem is.
I have serious doubts that this would help any user in any way. Something that few users know what it is in the first place failed, so the resulting system won't be able to boot. And then what? What is a user supposed to do? This isn't anything that even advanced users will be able to fix. They might try the installation again, and very likely they will get the same result. We are trying our best not to get the user into that situation in the first place. If that fails, very few users can do anything about it. And the information from such an improved message is the same as you get when the system can't boot: You'll realize that it doesn't boot. Not that you can realistically do anything about it, though. In particular not the kind of users that you mean to target with this: Those who are not familiar with the Linux boot process, as you write. What are they supposed to do in that situation?
> 2. If dracut fails because of missing modules during installation, allow the user to edit the INITRD_MODULES setting and try dracut again.
That is clearly a totally overengineered approach that only a minuscule number of users will be able to benefit from. It might rival only the Dracut emergency shell in terms of usability: When that thing appears, I'll simply reinstall. It's not humanly usable IMHO.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c2

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|                               |needinfo?(jreidinger@suse.com)
                 CC|                               |jreidinger@suse.com

--- Comment #2 from Stefan Hundhammer <shundhammer@suse.com> ---
Josef, what do you think about this?
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c3

--- Comment #3 from Stefan Hundhammer <shundhammer@suse.com> ---
BTW also in that forum thread you mention a migration path that I am not sure is supported: From Leap 15.4 to 15.5 to TW. You might get lucky, and it might work; but it might also fail in spectacular ways. And you are now suggesting workarounds for a totally self-made problem.

The beauty of Linux and Open Source in general is that you can get creative and do novel things. You can take charge; you are in total command of your system. But that comes at a cost, and that is taking responsibility if it fails. So please don't dump that responsibility into somebody else's lap.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c4

--- Comment #4 from Lukas Ocilka <locilka@suse.com> ---
(In reply to Stefan Hundhammer from comment #3)
> BTW also in that forum thread you mention a migration path that I am not sure is supported: From Leap 15.4 to 15.5 to TW.
It's usually better to convert Leap to TW than the other way round (should fail), but supported scenarios don't seem to be well-defined IMHO. See https://github.com/yast/skelcd-control-openSUSE/blob/master/control/control.... This states that openQA tests upgrades starting with "from 12.1", which sounds crazy ;)

BTW, INITRD_MODULES are not handled by YaST anymore. At least it says so here https://github.com/yast/yast-packager/blob/2d7f52b0c26f6b2e706482d3d3fd84e2c... (old) and in this change https://github.com/yast/yast-yast2/pull/1289 (new)
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c5

--- Comment #5 from Alexander Kruppa <akruppa@gmail.com> ---
(In reply to Stefan Hundhammer from comment #1)
> > My enhancement suggestion is:
> > 1. If dracut fails during installation, state clearly that this leaves the system in an unbootable state. This would help users who are not familiar with the Linux boot process to understand what the problem is.
> I have serious doubts that this would help any user in any way. Something that few users know what it is in the first place failed, so the resulting system won't be able to boot.
It took me a while to realize that dracut failing was the reason why the system wouldn't boot, and I dearly wished that someone (i.e., yast) had told me of this. Once I knew what the problem was, fixing it in a rescue system was relatively straightforward.
> > 2. If dracut fails because of missing modules during installation, allow the user to edit the INITRD_MODULES setting and try dracut again.
> That is clearly a totally overengineered approach that only a minuscule number of users will be able to benefit from. It might rival only the Dracut emergency shell in terms of usability: When that thing appears, I'll simply reinstall. It's not humanly usable IMHO.
This is fine. I thought it would be convenient to have, but maybe it's too rare a problem to have recovery built into yast.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c6

Josef Reidinger <jreidinger@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
                 CC|                               |akruppa@gmail.com
              Flags|needinfo?(jreidinger@suse.com) |needinfo?(akruppa@gmail.com)

--- Comment #6 from Josef Reidinger <jreidinger@suse.com> ---
Well, if dracut failed it should be properly reported. The second point is a bit too much from my POV. Could you please attach the yast logs from the failed attempt, so I can see what exactly dracut reports and with what exit code? Thanks
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c7

--- Comment #7 from Stefan Hundhammer <shundhammer@suse.com> ---
Alexander, this is still waiting for y2logs collected with the supplied "save_y2logs" script. Please notice that you won't need to do the installation again for this: Even if some y2logs rotated out of scope, the installation y2logs are still there in a separate tarball. Please simply attach the whole thing that "save_y2logs" gives you. TIA.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c8

--- Comment #8 from Alexander Kruppa <akruppa@gmail.com> ---
I was away. I just tried uploading y2logs (filename y2log-Py7jHc.tar.xz), but I keep getting an error message:

  Malformed multipart POST: data truncated

I can upload the file to some file hoster if you like - which one do you prefer?

I have also opened another bug, https://bugzilla.suse.com/show_bug.cgi?id=1212764 , four weeks ago, right after the upgrade to Tumbleweed, and I have uploaded another y2logs in that bug (it worked back then). Maybe the older y2logs contain the info you need?
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c9

--- Comment #9 from Stefan Hundhammer <shundhammer@suse.com> ---
Please try again with a different browser.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c10

--- Comment #10 from Alexander Kruppa <akruppa@gmail.com> ---
Created attachment 869022
  --> https://bugzilla.suse.com/attachment.cgi?id=869022&action=edit
Yast logs
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c11

--- Comment #11 from Alexander Kruppa <akruppa@gmail.com> ---
It worked with Konqueror.
https://bugzilla.suse.com/show_bug.cgi?id=1213222

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|needinfo?(akruppa@gmail.com)   |
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c12

--- Comment #12 from Stefan Hundhammer <shundhammer@suse.com> ---
From YaST2/y2log of the embedded yast-installation-logs.tar.xz of the attached y2logs tarball:

  2023-06-25 16:13:12 <3> install(6473) [Ruby] yast2/execute.rb(rescue in popup_error):235 Execution of command "[["/usr/bin/dracut", "--force", "--regenerate-all"]]" failed.
  Exit code: 3
  dracut[E]: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.CflB8X/initramfs -N ^i2o_scsi$ --kerneldir /lib/modules/5.14.21-150400.24.63-default/ -m pata_atiixp ata_generic fan ahci atiixp ide_pci_generic jbd ext3 edd
  dracut[F]: installkernel failed in module suse-initrd
  ...
  ...
  dracut[E]: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.HsA9Zf/initramfs -N ^i2o_scsi$ --kerneldir /lib/modules/5.14.21-150500.53-default/ -m pata_atiixp ata_generic fan ahci atiixp ide_pci_generic jbd ext3 edd
  dracut[F]: installkernel failed in module suse-initrd
  ...
  ...
  dracut[E]: FAILED: /usr/lib/dracut/dracut-install -D /var/tmp/dracut.1L7lUE/initramfs -N ^i2o_scsi$ --kerneldir /lib/modules/6.3.7-1-default/ -m pata_atiixp ata_generic fan atiixp ide_pci_generic jbd ext3 edd
  dracut[F]: installkernel failed in module suse-initrd
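To compare the failing runs side by side, the kernel version and the module list can be pulled out of each `dracut[E]: FAILED` entry. This is a small illustrative sketch in Python, not part of YaST; the function name `failed_runs` and the regular expression are assumptions based only on the excerpt above, and it assumes that every log entry sits on its own line, as it would in the actual y2log.

```python
import re

# Pattern derived from the log excerpt above; other dracut versions may
# format this line differently (assumption).
FAILED_RE = re.compile(
    r"dracut\[E\]: FAILED: .*--kerneldir /lib/modules/"
    r"(?P<kernel>[^/\s]+)/? -m (?P<modules>.+)$"
)


def failed_runs(log_text):
    """Return (kernel_version, [module, ...]) for each failed dracut-install call."""
    runs = []
    for line in log_text.splitlines():
        match = FAILED_RE.search(line)
        if match:
            runs.append((match.group("kernel"), match.group("modules").split()))
    return runs
```

Feeding the y2log text to `failed_runs` gives one entry per kernel, which makes it easier to see which names were passed after `-m` for each of them.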
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c13

--- Comment #13 from Stefan Hundhammer <shundhammer@suse.com> ---
So there were plenty of errors from that "dracut" call, but I am not sure if they were reported to the user (error popup or similar). Or maybe they were, but that wasn't logged; I am not sure.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c14

--- Comment #14 from Stefan Hundhammer <shundhammer@suse.com> ---
In the installation part of those y2logs, I see the storage probing twice, but no storage proposal, no storage actions, no committed storage setup.

  YaST2 84 % ls -1 storage-inst
  01-probed.xml
  01-probed.yml
  02-probed.xml
  02-probed.yml

I also don't see the chosen partition /dev/sda2 being formatted; no "mkfs" command was called.

In macro_inst_initial.ycp, I see the language selection where you selected English language and German keyboard layout, then a partition selection of /dev/sda2 (?!), then selecting some repos, going into the package selection, disabling secure boot (from the "Installation Summary" / proposal, I guess), and then starting the installation which took about 7 minutes.

storage-inst/02-probed.yml:

# 2023-06-25 16:07:07 -0400
---
- disk:
    name: "/dev/sda"
    size: 976762584 KiB (0.91 TiB)
    block_size: 0.5 KiB
    io_size: 0 B
    min_grain: 1 MiB
    align_ofs: 0 B
    partition_table: gpt
    partitions:
    - free:
        size: 1 MiB
        start: 0 B
    - partition:
        size: 1 MiB
        start: 1 MiB
        name: "/dev/sda1"
        type: primary
        id: unknown
    - free:
        size: 51199 MiB (50.00 GiB)
        start: 2 MiB
    - partition:
        size: 4 GiB
        start: 51201 MiB (50.00 GiB)
        name: "/dev/sda3"
        type: primary
        id: linux
        file_system: ext4
    - partition:
        size: 4 GiB
        start: 55297 MiB (54.00 GiB)
        name: "/dev/sda4"
        type: primary
        id: esp
        file_system: vfat
    - partition:
        size: 150315 MiB (146.79 GiB)
        start: 59393 MiB (58.00 GiB)
        name: "/dev/sda2"
        type: primary
        id: linux
        file_system: ext4
    - partition:
        size: 744161 MiB (0.71 TiB)
        start: 209708 MiB (204.79 GiB)
        name: "/dev/sda5"
        type: primary
        id: linux
        file_system: ext4
    - free:
        size: 728 KiB (0.71 MiB)
        start: 953869 MiB (0.91 TiB)
- disk:
    name: "/dev/sdb"
    size: 7814026584 KiB (7.28 TiB)
    block_size: 0.5 KiB
    io_size: 0 B
    min_grain: 1 MiB
    align_ofs: 0 B
    partition_table: gpt
    partitions:
    - free:
        size: 1 MiB
        start: 0 B
    - partition:
        size: 4064 GiB (3.97 TiB)
        start: 1 MiB
        name: "/dev/sdb1"
        type: primary
        id: linux
        file_system: ext4
    - partition:
        size: 32 GiB
        start: 4161537 MiB (3.97 TiB)
        name: "/dev/sdb3"
        type: primary
        id: linux
        file_system: swap
    - partition:
        size: 3519058247.5 KiB (3.28 TiB)
        start: 4194305 MiB (4.00 TiB)
        name: "/dev/sdb2"
        type: primary
        id: linux
        file_system: ext4
    - free:
        size: 16.5 KiB
        start: 7814026567.5 KiB (7.28 TiB)
- disk:
    name: "/dev/sdc"
    size: 29328 MiB (28.64 GiB)
    block_size: 0.5 KiB
    io_size: 0 B
    min_grain: 1 MiB
    align_ofs: 0 B
    partition_table: msdos
    mbr_gap: 1 MiB
    partitions:
    - free:
        size: 132 KiB
        start: 0 B
    - partition:
        size: 4662 KiB (4.55 MiB)
        start: 132 KiB
        name: "/dev/sdc1"
        type: primary
        id: esp
        file_system: vfat
    - partition:
        size: 223558 KiB (218.32 MiB)
        start: 4794 KiB (4.68 MiB)
        name: "/dev/sdc2"
        type: primary
        id: ''
        file_system: iso9660
        label: openSUSE-Tumbleweed-NET-x86_64
    - free:
        size: 29105 MiB (28.42 GiB)
        start: 223 MiB

------------------------------------

The system was detected as efiboot, yet the user chose to disable secure boot from the "Installation Summary" proposal:

  2023-06-25 16:11:36 <1> install(6473) [Ruby] installation/proposal_runner.rb(block in input_loop):212 Proposal - UserInput: 'disable_secure_boot'
  2023-06-25 16:11:36 <1> install(6473) [Interpreter] installation/proposal_store.rb:295 Calling YaST client bootloader_proposal
  2023-06-25 16:11:36 <1> install(6473) [Ruby] installation/proposal_client.rb(run):82 Called Bootloader::ProposalClient.run with AskUser and params {"chosen_id"=>"disable_secure_boot", "has_next"=>false}
  2023-06-25 16:11:36 <1> install(6473) [Ruby] bootloader/proposal_client.rb(ask_user):103 ask user called with disable_secure_boot
  2023-06-25 16:11:36 <1> install(6473) [Ruby] bootloader/proposal_client.rb(single_click_action):364 single_click_action: option secure_boot, value false
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c15

--- Comment #15 from Stefan Hundhammer <shundhammer@suse.com> ---
I was a bit confused about the scenario; in comment #0, it says "upgrade from Leap 15.5 to TW", yet in the y2logs I see

  2023-06-25 20:05:54 y2base called with ["installation", "--arg", "initial", "qt", "--noborder", "--auto-fonts", "--fullscreen"]

yet install.inf has

  Upgrade: 1

AFAICS it was an upgrade from TW to TW, though.

  2023-06-25 16:06:50 <1> install(6473) [Ruby] modules/RootPart.rb(CheckPartition):1341 found fstab on /dev/sda2
  2023-06-25 16:06:50 <1> install(6473) [Ruby] modules/Misc.rb(SysconfigRead):182 ."/mnt/etc/os-release"."PRETTY_NAME": 'openSUSE Tumbleweed'

But at least this explains my confusion about the storage setup and the lack of a "mkfs" call.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c16

--- Comment #16 from Alexander Kruppa <akruppa@gmail.com> ---
Initially I upgraded from 15.5 to TW, but when the resulting system would not boot, I ran the TW installation medium again, trying to fix the boot settings (and possibly a third time, I don't remember for certain). Maybe the TW -> TW upgrade refers to the second run?
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c17

--- Comment #17 from Stefan Hundhammer <shundhammer@suse.com> ---
Well, of course a second installation / upgrade attempt overwrites the logs from the first one.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c18

--- Comment #18 from Alexander Kruppa <akruppa@gmail.com> ---
Oof. I didn't know that, I assumed they'd get appended. So using the install medium as a makeshift rescue system is a no-no? What is the recommended way of fixing the initrd without losing the yast logs?

I have a retired laptop; I can install 15.5 on it, add a nonexistent module to INITRD_MODULES and do the upgrade to TW. If it fails again, which it should, I'll attach the logs of the failed run to this bug.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c19

--- Comment #19 from Stefan Hundhammer <shundhammer@suse.com> ---
Thanks, but I don't think that is necessary. After all, even that second log shows dracut failing.

I'd be interested to hear if there was any error pop-up about that failure. I don't see anything in the logs.

Now the next thing to investigate is of course *why* it failed, and what can be done about it.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c20

--- Comment #20 from Stefan Hundhammer <shundhammer@suse.com> ---
During the dracut run, it always complained

  dracut-install: Failed to find module 'atiixp'

for each of the kernels just before

  dracut[E]: FAILED: /usr/lib/dracut/dracut-install

A lot of modules are installed successfully:

  dracut[I]: *** Including module: systemd ***
  dracut[I]: *** Including module: systemd-initrd ***
  dracut[I]: *** Including module: i18n ***
  dracut[I]: *** Including module: drm ***
  dracut[I]: *** Including module: plymouth ***
  dracut[I]: *** Including module: kernel-modules ***
  dracut[I]: *** Including module: kernel-modules-extra ***
  dracut[I]: *** Including module: resume ***
  dracut[I]: *** Including module: rootfs-block ***
  dracut[I]: *** Including module: suse-btrfs ***
  dracut[I]: *** Including module: suse-xfs ***
  dracut[I]: *** Including module: terminfo ***
  dracut[I]: *** Including module: udev-rules ***
  dracut[I]: *** Including module: dracut-systemd ***
  dracut[I]: *** Including module: haveged ***
  dracut[I]: *** Including module: ostree ***
  dracut[I]: *** Including module: usrmount ***
  dracut[I]: *** Including module: base ***
  dracut[I]: *** Including module: fs-lib ***
  dracut[I]: *** Including module: shutdown ***
  dracut[I]: *** Including module: suse ***
  dracut[I]: *** Including module: suse-initrd ***

Each one corresponds to a subdirectory of /usr/lib/dracut/modules.d, e.g. 00systemd, 50plymouth, 95suse-btrfs. But there is indeed none with ??atiixp.

The next question is why that was even considered necessary.
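The name-to-directory correspondence mentioned above can be listed with a few lines of Python. This is a hedged sketch only: the function name `dracut_module_names` is invented here, and it assumes that every dracut module directory is named with a numeric prefix followed by the module name, as in the three examples given. Note also that the names passed after `-m` in the failed `dracut-install` call look like kernel module names, so they would not necessarily be expected to show up in this listing.

```python
import re
from pathlib import Path


def dracut_module_names(modules_dir):
    """Map dracut module names to their directory names below modules_dir.

    Assumes directories are named <digits><name>, e.g. 50plymouth.
    """
    names = {}
    for entry in sorted(Path(modules_dir).iterdir()):
        match = re.fullmatch(r"(\d+)(.+)", entry.name)
        if entry.is_dir() and match:
            names[match.group(2)] = entry.name
    return names
```

Calling `dracut_module_names("/usr/lib/dracut/modules.d")` and looking up a name such as `plymouth` or `atiixp` in the result reproduces the manual check described in this comment.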
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c21

--- Comment #21 from Stefan Hundhammer <shundhammer@suse.com> ---
install.inf from the y2logs tarball contains this reference to pata_atiixp:

  InitrdModules: scsi_dh scsi_dh_alua scsi_dh_emc scsi_dh_rdac pata_atiixp ata_generic cdrom sr_mod usb-common usbcore usbhid st sg thermal iscsi_boot_sysfs

Probably dracut uses 'lsmod' to check what kernel modules are currently loaded (i.e. what hardware the kernel detected), and it detected a chipset that needs the 'pata_atiixp' kernel module, which in turn might require the 'atiixp' module (AMD / ATI hardware?).

I can't verify that because I don't have any hardware that needs those kernel modules. But you could check with

  hwinfo | grep module | grep atiixp

If that doesn't give you any results, please attach the output of `sudo hwinfo`.
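The question of whether any of the suspect modules is actually loaded can also be answered from `lsmod` output. The snippet below is only an illustrative Python sketch; `loaded_modules` and `not_loaded` are hypothetical helper names, and it assumes the usual three-column `lsmod` layout with a header line starting with "Module".

```python
def loaded_modules(lsmod_output):
    """Return the set of module names (first column) from `lsmod` output."""
    names = set()
    for line in lsmod_output.splitlines():
        fields = line.split()
        # Skip blank lines and the "Module  Size  Used by" header.
        if not fields or fields[0] == "Module":
            continue
        names.add(fields[0])
    return names


def not_loaded(candidates, lsmod_output):
    """Return the candidate module names that do not appear in the lsmod output."""
    loaded = loaded_modules(lsmod_output)
    return [name for name in candidates if name not in loaded]
```

With the text of an attached `lsmod` dump as the second argument, `not_loaded(["pata_atiixp", "atiixp"], ...)` would show which of the two names the running kernel does not have loaded.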
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c22

--- Comment #22 from Stefan Hundhammer <shundhammer@suse.com> ---
Also please attach the output of 'lsmod'.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c23

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|                               |needinfo?(akruppa@gmail.com)

--- Comment #23 from Stefan Hundhammer <shundhammer@suse.com> ---
Waiting for feedback; see comment #21 and #22.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c24

--- Comment #24 from Stefan Hundhammer <shundhammer@suse.com> ---
Waiting for feedback; see comment #21.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c25

--- Comment #25 from Alexander Kruppa <akruppa@gmail.com> ---
(In reply to Stefan Hundhammer from comment #19)
> I'd be interested to hear if there was any error pop-up about that failure. I don't see anything in the logs.
I'm back. There was a pop-up which gave the full dracut output, including errors like the

  dracut-install: Failed to find module 'atiixp'

line you already mentioned.

What was missing, imho, was a notification from Yast that dracut failing results in an unbootable system. This is probably blindingly obvious for people familiar with the Linux boot process, but it wasn't for me. Ideally it would be nice to point out that the nonexistent modules are listed in the INITRD_MODULES variable in the /etc/sysconfig/kernel file. Finding out the former was easy, but I think it is worth pointing out during installation for casual users. Finding out the latter took me a long time.

The command

  hwinfo | grep module | grep atiixp

outputs nothing on my current system.

My current system (AMD Ryzen 5 5600G/B550 chipset-based) is the result of a great many system upgrades. I don't remember when I did the last clean install, but it was certainly many years ago when I had an Intel Skylake-based system - or possibly even on an AMD Phenom-based system. Perhaps the stale entries in INITRD_MODULES originate from those older installations?

I still have the SuSE 15.5 installation on a retired SSD, so I can rebuild the old Skylake system and boot it exactly the way it used to be, if that helps in finding out what modules etc. it used. Please tell me if I should do so and what info you'd like to get from the old installation.

Mostly what I'm hoping for in this enhancement bug is just a message from Yast,

  "dracut failed to install an initrd (initial ramdisk), the system is not bootable! Please fix any errors reported by dracut and re-install the bootloader."

and information somewhere, from Yast or a knowledge base article or man page etc. pointed out by Yast, saying that dracut gets additional modules to install in the initrd from the INITRD_MODULES variable in /etc/sysconfig/kernel.

This info used to be available in the man page of mkinitrd-suse, but that program is retired in Tumbleweed, and the dracut man page does not say anything about INITRD_MODULES.

Output of lsmod will be attached.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c26

--- Comment #26 from Alexander Kruppa <akruppa@gmail.com> ---
Created attachment 869285
  --> https://bugzilla.suse.com/attachment.cgi?id=869285&action=edit
Output of lsmod from Tumbleweed on AMD 5600G/B550
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c27

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|needinfo?(akruppa@gmail.com)   |

--- Comment #27 from Stefan Hundhammer <shundhammer@suse.com> ---
(In reply to Alexander Kruppa from comment #25)
> (In reply to Stefan Hundhammer from comment #19)
> > I'd be interested to hear if there was any error pop-up about that failure. I don't see anything in the logs.
> There was a pop-up which gave the full dracut output, including errors like the
>   dracut-install: Failed to find module 'atiixp'
> line you already mentioned.
Okay, that's the most important thing. So the error was reported.
> What was missing, imho, was a notification from Yast that dracut failing results in an unbootable system.
This is already where speculations and dracut expert knowledge start. We don't really KNOW that; we can make assumptions. Dracut did something. Maybe it did create an initrd that can be used for at least rudimentary booting. Maybe it didn't create anything. But whatever happened, it's bad, and it's not easy to overcome for any normal user; even advanced users will struggle with it. This is something that simply should not happen. If it happens, things have already gone downhill.
> This is probably blindingly obvious for people familiar with the Linux boot process, but it wasn't for me. Ideally it would be nice to point out that the nonexistent modules are listed in the INITRD_MODULES variable in /etc/sysconfig/kernel file.
...which is actually no longer true; it was a kludge on top of a kludge: the 'mkinitrd' script, which is deprecated and may be completely dropped very soon, reads values from /etc/sysconfig/kernel and feeds them to the new dracut mechanisms that work completely differently.

We can't provide a dracut tutorial at that point. And we also don't want to, because those things have a tendency to change over time (see mkinitrd), and then the information that we give the user becomes increasingly outdated or maybe even completely wrong.

For the brave and adventurous, this is the time to read dracut documentation online or read forums and start experimenting. Less enterprising users might try to reinstall or simply give up. That's sad, but I don't see how we can improve the situation significantly. That situation should simply not happen.
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c28

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|                               |needinfo?(jreidinger@suse.com)

--- Comment #28 from Stefan Hundhammer <shundhammer@suse.com> ---
Josef, any more input from your side?
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c29

Josef Reidinger <jreidinger@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
              Flags|needinfo?(jreidinger@suse.com) |

--- Comment #29 from Josef Reidinger <jreidinger@suse.com> ---
Well, as dracut exits with exit code 3, we should show an error popup saying that it failed, and it also prints the stderr text. I worry we cannot make many assumptions there, but of course, if needed, we can improve that popup to be more specific to dracut (now it is a generic "command failed" popup).

Related code parts: bootloader calling dracut - https://github.com/yast/yast-bootloader/blob/13381cf1d65894af1fdd2b19cf42440... and execute where popup is raised: https://github.com/yast/yast-yast2/blob/c443e4d77fabffbf743259906eb5e3fae723...
https://bugzilla.suse.com/show_bug.cgi?id=1213222#c30

Stefan Hundhammer <shundhammer@suse.com> changed:

           What    |Removed                        |Added
----------------------------------------------------------------------------
         Resolution|---                            |WONTFIX
             Status|NEW                            |RESOLVED

--- Comment #30 from Stefan Hundhammer <shundhammer@suse.com> ---
Since there already is a popup, and I expect users on that level of expertise to have a look at the log file (/var/log/YaST/y2log) where the whole dracut output is logged, I am not sure about the benefit of yet another special handling of yet another rare special case. Those are the things that keep eating away at our development resources and that add to code bloat and largely dead code that is never executed or tested, so it's pre-programmed bit rot. See also https://bugzilla.suse.com/show_bug.cgi?id=1212560#c10 from yesterday.

About /etc/sysconfig/kernel and INITRD_MODULES there, see bug #1212764. We keep accumulating that kind of bit rot anyway; it's not a good idea to add even more artificially.

So, after this lengthy discussion and careful consideration, the conclusion is that we are not going to add even more special handling.

Alexander, just to clarify: Your input and contribution is appreciated, and this whole discussion was very valuable, even though the result is not what you wished for. But we have to keep the whole YaST and openSUSE project in mind.