[Bug 1189762] New: Kernel 5.3.18-57/59 stuck in early boot, acpi=off helps
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762

Bug ID: 1189762
Summary: Kernel 5.3.18-57/59 stuck in early boot, acpi=off helps
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 15.3
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: pastas4@gmail.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

Created attachment 852006
  --> http://bugzilla.opensuse.org/attachment.cgi?id=852006&action=edit
hwinfo output

After the upgrade from Leap 15.2 to 15.3, none of the kernels boot past the "starting initial ramdisk..." on an Acer Aspire A315-21-49UR. Specifically, I tested these kernels and they get stuck:

5.3.18-57.3-{default,preempt}
5.3.18-59.19.1-{default,preempt}

The kernel from 15.2 works well, namely:

5.3.18-lp152.87.1-preempt

This is not a secure boot issue because I have it disabled. Also, the kernels above do boot if I enter `acpi=off` in the command line. However, they still don't boot to the point where logs get saved (so I can't see anything with journalctl -b -1), but at least I see some kernel output.

Let me know what would help to debug this. It seems a bit similar to the intel_iommu issue, but this laptop is AMD, not Intel. hwinfo output is attached.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c1

Dainius Masiliunas <pastas4@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
            Summary|Kernel 5.3.18-57/59 stuck   |Kernel 5.3.18-57/59 stuck
                   |in early boot, acpi=off     |in early boot on AMD
                   |helps                       |hardware, iommu=soft helps

--- Comment #1 from Dainius Masiliunas <pastas4@gmail.com> ---
After doing a bunch of trial and error with kernel parameters, it turns out that the culprit is indeed iommu. Obviously the intel_iommu option doesn't work because this is AMD, but the regular iommu option makes a difference. It boots to graphical interface with iommu=off (but then the mouse doesn't work), and it boots correctly (including the mouse) with iommu=soft. No other iommu options (e.g. iommu=calgary) make any difference.
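Editorial note: when comparing boots like this, it can help to confirm which `iommu=` values a given boot actually received, for example by inspecting the contents of /proc/cmdline on the running system. The following is a minimal Python sketch of that check, not something taken from the bug report; the function name `iommu_settings` and the sample command line are hypothetical.

```python
def iommu_settings(cmdline):
    """Return the values of all `iommu=` tokens, in the order they appear.

    Tokens such as `intel_iommu=on` are deliberately not matched, because
    they are separate kernel parameters.
    """
    prefix = "iommu="
    return [tok[len(prefix):] for tok in cmdline.split() if tok.startswith(prefix)]


# Hypothetical command line, in the shape of what /proc/cmdline contains.
sample = "BOOT_IMAGE=/boot/vmlinuz root=/dev/sda2 intel_iommu=on iommu=soft"
print(iommu_settings(sample))
```

On a booted system the same function could be applied to the text read from /proc/cmdline instead of the sample string.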
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c4

Dainius Masiliunas <pastas4@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(pastas4@gmail.com |
                   |)                           |

--- Comment #4 from Dainius Masiliunas <pastas4@gmail.com> ---
@Takashi: Yes, 5.13.12-lp153.7.gd66b4c0-default works correctly without any kernel options!

@Joerg: Do you mean without any options on 5.3.18-59.19.1? In that case, no, it literally freezes while showing the GRUB "Starting initial ramdisk..." line, i.e. there is no video output at all. With acpi=off I would be able to see where it freezes, but is that useful? With iommu=off or iommu=soft it does not freeze. With all other iommu options it does not produce any output, just like if I pass no iommu options at all.
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c5

--- Comment #5 from Joerg Roedel <jroedel@suse.com> ---
(In reply to Dainius Masiliunas from comment #4)
> @Joerg: Do you mean without any options on 5.3.18-59.19.1? In that case, no, it literally freezes while showing the GRUB "Starting initial ramdisk..." line, i.e. there is no video output at all. With acpi=off I would be able to see where it freezes, but is that useful? With iommu=off or iommu=soft it does not freeze. With all other iommu options it does not produce any output, just like if I pass no iommu options at all.
When you are in GRUB you can press 'e' to edit the entry, and in this editor you can remove individual kernel options for testing. Please remove the 'quiet' and 'splash=silent' options there. This should show you all boot messages on the screen, up to the freeze.
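Editorial note: the effect of that edit on the kernel command line can be sketched in Python. This is only an illustration of which tokens go away, not a tool used in the thread; the function name `strip_params` and the sample command line are hypothetical, and only `quiet` and `splash=silent` come from the comment above.

```python
def strip_params(cmdline, unwanted):
    """Return cmdline with every whitespace-separated token in `unwanted` removed."""
    return " ".join(tok for tok in cmdline.split() if tok not in unwanted)


# Hypothetical command line, as it might appear on the `linux` line in the GRUB editor.
example = "root=/dev/sda2 splash=silent quiet mitigations=auto"
print(strip_params(example, {"quiet", "splash=silent"}))
```

Changes made in the GRUB editor apply to the current boot only, which makes them convenient for this kind of trial and error.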
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c6

--- Comment #6 from Dainius Masiliunas <pastas4@gmail.com> ---
That's the first thing I did, surely. But no, it freezes so early in boot that it does not display anything at all. It literally shows the GRUB screen with "starting initial ramdisk...", it does not even become black.
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c7

--- Comment #7 from Joerg Roedel <jroedel@suse.com> ---
(In reply to Dainius Masiliunas from comment #6)
> That's the first thing I did, surely. But no, it freezes so early in boot that it does not display anything at all. It literally shows the GRUB screen with "starting initial ramdisk...", it does not even become black.
Does it make a difference if you add 'earlyprintk=vga' in addition to removing the other parameters?
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c9

--- Comment #9 from Dainius Masiliunas <pastas4@gmail.com> ---
I tried booting with just earlyprintk=vga, with just nomodeset, and with both combined. The result is the same in every case: I still just see the GRUB screen.
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c10

--- Comment #10 from Dainius Masiliunas <pastas4@gmail.com> ---
Created attachment 852078
  --> http://bugzilla.opensuse.org/attachment.cgi?id=852078&action=edit
Screencap of earlycon=efifb output

I now tried with earlycon=efifb, and that gives output! It seems that it works correctly all the way until the earlycon gets disabled and the regular console gets enabled. I also tried with both earlycon=efifb and nomodeset, and the result is the same except that there is an extra warning that this means the graphics drivers will be disabled. So I guess this is in some way related to amdgpu, then?
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c11

--- Comment #11 from Joerg Roedel <jroedel@suse.com> ---
(In reply to Dainius Masiliunas from comment #10)
> Created attachment 852078 [details]
> Screencap of earlycon=efifb output
>
> I now tried with earlycon=efifb, and that gives output! It seems that it works correctly all the way until the earlycon gets disabled and the regular console gets enabled. I also tried with both earlycon=efifb and nomodeset, and the result is the same except that there is an extra warning that this means the graphics drivers will be disabled. So I guess this is in some way related to amdgpu, then?
If that works, how about 'earlyprintk=efi,keep'? Does that show the output until the actual crash?
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c12

--- Comment #12 from Dainius Masiliunas <pastas4@gmail.com> ---
Created attachment 852085
  --> http://bugzilla.opensuse.org/attachment.cgi?id=852085&action=edit
Screencap of earlycon=efifb keep_bootcon

Success! Of course, earlyprintk=efi doesn't work because it was removed in kernel 5.1; but using earlycon=efifb together with keep_bootcon instead gives me output. It shows that the kernel is stuck in a loop with "bad: scheduling from the idle thread!" and a stack trace. See the attached screencap. There is a bit of overlap due to screen refresh, but the output loops, so you can see the same thing repeated twice on each screen.
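Editorial note: the combination that finally produced output (`earlycon=efifb` plus `keep_bootcon`, with `quiet` and `splash=silent` removed as suggested earlier in the thread) can be sketched as a small Python transformation of a command line string. This is an illustrative sketch only; the function name `debug_cmdline` and the sample command line are hypothetical.

```python
def debug_cmdline(cmdline):
    """Drop the options that hide boot messages and append the early-console ones."""
    hidden = {"quiet", "splash=silent"}
    wanted = ["earlycon=efifb", "keep_bootcon"]
    tokens = [tok for tok in cmdline.split() if tok not in hidden]
    for param in wanted:
        if param not in tokens:  # avoid adding a parameter twice
            tokens.append(param)
    return " ".join(tokens)


# Hypothetical starting point; the real `linux` line will differ per system.
print(debug_cmdline("root=/dev/sda2 splash=silent quiet"))
```

The resulting string is what one would type on the `linux` line in the GRUB editor for a one-off debugging boot.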
http://bugzilla.opensuse.org/show_bug.cgi?id=1189762#c15

Dainius Masiliunas <pastas4@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(pastas4@gmail.com |
                   |)                           |

--- Comment #15 from Dainius Masiliunas <pastas4@gmail.com> ---
Yes, it works! I have booted successfully without any additional options:

Linux stoneyridge 5.3.18-1.gb6fd05f-default #1 SMP Mon Aug 30 16:10:55 UTC 2021 (b6fd05f) x86_64 x86_64 x86_64 GNU/Linux