[Bug 1087525] New: System won't start after kernel 4.12.14-lp150.7.2 update
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525

            Bug ID: 1087525
           Summary: System won't start after kernel 4.12.14-lp150.7.2 update
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.0
          Hardware: x86-64
                OS: Other
            Status: NEW
          Severity: Major
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-maintainers@forge.provo.novell.com
          Reporter: alpvonkri@protonmail.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

I initially reported bug 1086742 (AMD graphics don't work after the last kernel update).

After the latest update to 4.12.14-lp150.7.2 (the fix for bug 1086742), the system won't start at all; it hangs after GRUB.

Previously I was able to boot with kernel 4.12.14-lp150.5.3 and get a fully working graphical system. Now even kernel 4.12.14-lp150.5.3 won't start. I have to boot with 4.12.14-lp150.6.4, which means my system falls back to a basic 800x600 display.

* System: AMD A10-8700P Radeon R6, 10 Compute Cores 4C+6G @ 4x 1.8GHz
* Graphics card:
  00:01.0 VGA compatible controller [0300]: Advanced Micro Devices, Inc. [AMD/ATI] Carrizo [1002:9874] (rev c5)
    Subsystem: Hewlett-Packard Company Device [103c:80b5]
    Kernel driver in use: amdgpu
    Kernel modules: amdgpu
* Attached: videos of the two non-working kernels booting

Let me know if more info is needed.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c1

--- Comment #1 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765477
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765477&action=edit
4.12.14-lp150.5.3
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c2

--- Comment #2 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Comment on attachment 765477
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765477
4.12.14-lp150.5.3

This applies to kernels:
  4.12.14-lp150.5.3
  4.12.14-lp150.7.2
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c3

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tiwai@suse.com

--- Comment #3 from Takashi Iwai <tiwai@suse.com> ---
This could be a problem with the graphics driver. That things "worked" with the earlier lp150.5.x package was a side effect of a wrong kconfig, which left the amdgpu graphics driver out of kernel-default.

Try to boot with the "nomodeset" boot option. This suppresses the amdgpu KMS driver, so the display stays on the VESA or EFI framebuffer. Also remove the "quiet" boot option.

If the nomodeset boot option doesn't help, forget the tests below. It's something else, and we need a different analysis.

If nomodeset worked, keep booting with the nomodeset option. After boot, switch to the Linux console (ctrl-alt-F1) and reload the amdgpu driver like this:

  modprobe -r amdgpu
  modprobe amdgpu modeset=1

This might or might not crash. If it crashes, try alt-sysrq-S and alt-sysrq-B to make the system sync and reboot by itself (with luck). Then check the previous kernel log via "journalctl -k -b-1" as root. We might see some messages.

If you have another machine on the network, you can set up netconsole to catch the kernel messages remotely. It may work better than the above.

And if it's a crash (or kernel panic), ideally try to get the kernel messages via kdump.

The next step would be the same as above, a reload of amdgpu, but with one more step in between:

  modprobe -r amdgpu
  modprobe drm debug=0x0e
  modprobe amdgpu modeset=1

This gives more verbose information about graphics in the kernel messages.

Once you get kernel messages showing a crash, please upload them to Bugzilla. Please also attach the output of hwinfo.
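
For the netconsole suggestion in comment #3, a minimal sketch of how the module parameter might be assembled. The helper name `netconsole_param`, the addresses, and the interface name are illustrative assumptions, not values from this report; the string layout follows the kernel's netconsole documentation (`src-port@src-ip/dev,tgt-port@tgt-ip/tgt-mac`).

```shell
# netconsole_param SRC_PORT SRC_IP DEV TGT_PORT TGT_IP [TGT_MAC]
# Hypothetical helper: prints the "netconsole=..." module parameter.
# An empty TGT_MAC makes the kernel fall back to the broadcast address.
netconsole_param() {
    printf 'netconsole=%s@%s/%s,%s@%s/%s\n' "$1" "$2" "$3" "$4" "$5" "${6:-}"
}

# Example (placeholder addresses, run as root on the failing machine):
#   modprobe netconsole "$(netconsole_param 6665 192.168.1.10 eth0 6666 192.168.1.20)"
```

On the receiving machine, a UDP listener on the target port (for example some netcat variant) would collect the messages; the exact listener flags depend on the tool installed.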
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c4

--- Comment #4 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765490
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765490&action=edit
Images, hwinfo, journalctl
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c5

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |alpvonkri@protonmail.com
              Flags|                            |needinfo?(alpvonkri@protonmail.com)

--- Comment #5 from Takashi Iwai <tiwai@suse.com> ---
You need to pass the modeset=1 option to the amdgpu driver, not nomodeset=1.

  modprobe amdgpu modeset=1
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c6

Marco Antonio Flores <alpvonkri@protonmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(alpvonkri@protonmail.com) |

--- Comment #6 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
I made a mistake with that picture, but:

  modprobe -r amdgpu
  modprobe amdgpu modeset=1

returns the same result as:

  modprobe -r amdgpu
  modprobe amdgpu nomodeset=1

* Image attached.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c7

--- Comment #7 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765513
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765513&action=edit
modeset=1
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c8

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(alpvonkri@protonmail.com)

--- Comment #8 from Takashi Iwai <tiwai@suse.com> ---
Damn, it's still missing that option. I need to patch it and send it upstream...

OK, then try to blacklist this module first. Add the following line to /etc/modprobe.d/99-local.conf:

  blacklist amdgpu

Then run mkinitrd to rebuild the initrd, and reboot without the "nomodeset" option and without the "quiet" option, but with the "3" option. This avoids starting X. Then try to load the amdgpu driver:

  modprobe drm debug=0x0e
  modprobe amdgpu

Also, try a newer kernel, e.g. the kernel from the OBS Kernel:stable repo. If that works with the amdgpu driver, we might be able to backport something.
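
The blacklist step from comment #8 is toggled several times later in the thread, so a small sketch of adding and removing the line idempotently may help. The function names `add_blacklist` and `remove_blacklist` are hypothetical; the default path is the one named in the comment. Neither function rebuilds the initrd, so mkinitrd still has to be run afterwards as described above.

```shell
# add_blacklist MODULE [CONF]: append "blacklist MODULE" unless already present.
add_blacklist() {
    conf="${2:-/etc/modprobe.d/99-local.conf}"
    # -x matches whole lines only, -s keeps grep quiet if CONF does not exist yet
    grep -qsx "blacklist $1" "$conf" || printf 'blacklist %s\n' "$1" >> "$conf"
}

# remove_blacklist MODULE [CONF]: drop that exact line, keep everything else.
remove_blacklist() {
    conf="${2:-/etc/modprobe.d/99-local.conf}"
    [ -f "$conf" ] || return 0
    tmp="$(mktemp)" || return 1
    grep -vx "blacklist $1" "$conf" > "$tmp" || true   # exit 1 only means "no lines left"
    cat "$tmp" > "$conf"                                # rewrite in place, keeps permissions
    rm -f "$tmp"
}
```

Run as root, `add_blacklist amdgpu` followed by mkinitrd would correspond to the step above, and `remove_blacklist amdgpu` followed by mkinitrd would undo it.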
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c9

Marco Antonio Flores <alpvonkri@protonmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(alpvonkri@protonmail.com) |

--- Comment #9 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765618
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765618&action=edit
blacklist

Steps:
1) Edit /etc/modprobe.d/99-local.conf
2) Run: mkinitrd
3) Reboot
4) Edit GRUB entry
5) Boot
6) Run: modprobe drm debug=0x0e
7) Run: modprobe amdgpu
8) Run: startx

Result:
Working system

Notes:
I didn't install a newer kernel from OBS. Do you still need me to do that?
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c10

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(alpvonkri@protonmail.com)

--- Comment #10 from Takashi Iwai <tiwai@suse.com> ---
Hrm, so it works now? That's surprising. Let's concentrate on the Leap 15.0 kernel, then.

OK, now could you just boot straight, without nomodeset but keeping the blacklist? Does it boot with the native graphics?

If yes, try removing the blacklist, running mkinitrd, and booting without nomodeset. Does it break again?
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c11

Marco Antonio Flores <alpvonkri@protonmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(alpvonkri@protonmail.com) |

--- Comment #11 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Yes, booting without "nomodeset" while keeping the blacklist does boot with native graphics at 800x600; if I then run "modprobe amdgpu", I get a fully working display.

Yes, removing the blacklist, running "mkinitrd", and booting without "nomodeset" breaks my system again.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c12

--- Comment #12 from Takashi Iwai <tiwai@suse.com> ---
OK, then let's try something different:

- Remove the blacklist again and run mkinitrd to rebuild the initrd.
- Reboot without nomodeset, and also remove the quiet option. In addition, add these boot options:
    plymouth.enable=0 drm.debug=0x0e

If this makes the system stall, check whether it's really a crash or just a blank screen. It'd be best if you can access this machine remotely. If not, try something like:

  ctrl-alt-F1
  ctrl-alt-delete

to reboot. If the machine reacts to that sequence, it means it's alive. Then you can run
  journalctl -b-1
to get the previous log.
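
As a rough illustration of the boot-option changes asked for in comment #12, the sketch below rewrites a kernel command line string: it drops "quiet" and "nomodeset" and appends the two debug options. The helper name `debug_cmdline` is hypothetical and the sample command line in the usage note is made up; the result is only printed and would still have to be entered by hand in the GRUB editor.

```shell
# debug_cmdline "CMDLINE": print CMDLINE without "quiet" and "nomodeset",
# with plymouth.enable=0 and drm.debug=0x0e appended.
debug_cmdline() {
    out=""
    for word in $1; do            # unquoted on purpose: split on whitespace
        case "$word" in
            quiet|nomodeset) ;;   # dropped for this test boot
            *) out="${out:+$out }$word" ;;
        esac
    done
    printf '%s\n' "${out:+$out }plymouth.enable=0 drm.debug=0x0e"
}
```

For example, `debug_cmdline "root=/dev/sda2 splash=silent quiet nomodeset"` prints `root=/dev/sda2 splash=silent plymouth.enable=0 drm.debug=0x0e`.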
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c13

--- Comment #13 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765634
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765634&action=edit
Test 3
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c14

--- Comment #14 from Takashi Iwai <tiwai@suse.com> ---
The kernel log contains:

  Mar 31 01:57:43 15-ab126la kernel: amdgpu 0000:00:01.0: Direct firmware load for amdgpu/carrizo_pfp.bin failed with error -2

Do you happen to have the amdgpu-pro package installed? It's known to break things. Try uninstalling it if it's there.
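
Error -2 in the log line from comment #14 is ENOENT ("No such file or directory"), meaning the kernel could not find the firmware file at the time the driver loaded. A minimal sketch for pulling such file names out of a kernel log follows; the function name `missing_firmware` is an assumption for illustration, and the pattern is based only on the message format quoted above.

```shell
# missing_firmware: read kernel log lines on stdin and print each firmware file
# whose direct load failed with error -2 (ENOENT), one per line, de-duplicated.
missing_firmware() {
    sed -n 's/.*Direct firmware load for \([^ ]*\) failed with error -2[[:space:]]*$/\1/p' | sort -u
}

# Example (as root, previous boot):
#   journalctl -k -b -1 | missing_firmware
```

Each reported name could then be checked both under /lib/firmware on the installed system and inside the initrd (for instance with dracut's lsinitrd), since the driver may be loaded from the initrd before the root filesystem is available.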
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c15

--- Comment #15 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
I had it. When things first started to fail, I installed it to see if it would help, but since it didn't, I uninstalled it. I don't have it anymore.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c16

--- Comment #16 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765639
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765639&action=edit
amdgpu

This is what I currently have related to amdgpu.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c17

--- Comment #17 from Takashi Iwai <tiwai@suse.com> ---
Hm, maybe it left something bad behind? Check the files in /etc/dracut.conf.d/*.

Also, do you have the kernel-firmware package installed?
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c18

--- Comment #18 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765640
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765640&action=edit
dracut firmware kernel xorg mesa

1) /etc/dracut.conf.d/
2) Kernel packages
3) Firmware packages
4) Mesa packages
5) XOrg packages
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c19

--- Comment #19 from Takashi Iwai <tiwai@suse.com> ---
dracut.conf.d contains the following:

  -rw-r--r-- 1 root root 94 mar 24 21:41 /etc/dracut.conf.d/amdgpu-4.12.14-lp150.5-default.conf
  -rw-r--r-- 1 root root 94 mar 29 14:46 /etc/dracut.conf.d/amdgpu-4.12.14-lp150.6-default.conf

Who owns these files? Check via

  rpm -qf /etc/dracut.conf.d/amdgpu*.conf

Then remove these files, run mkinitrd, and retest. This should make things work again.
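
The ownership check from comment #19 can be extended to every file in the directory. The sketch below filters the output of `rpm -qf` for paths that no installed package owns; the function name `unowned_files` is hypothetical, and it assumes rpm's English message "file ... is not owned by any package", which is why the usage example forces the C locale.

```shell
# unowned_files: read "rpm -qf" output on stdin and print only the paths
# that rpm reports as not belonging to any installed package.
unowned_files() {
    sed -n 's/^file \(.*\) is not owned by any package$/\1/p'
}

# Example:
#   LC_ALL=C rpm -qf /etc/dracut.conf.d/*.conf 2>&1 | unowned_files
```

Files listed this way are candidates for leftovers from a package that was removed or installed outside rpm; whether they are safe to delete still needs a look at their contents.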
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c20

--- Comment #20 from Marco Antonio Flores <alpvonkri@protonmail.com> ---
Created attachment 765641
  --> http://bugzilla.opensuse.org/attachment.cgi?id=765641&action=edit
Output

1) Delete: files
2) Edit: blacklist
3) Run: mkinitrd

Yes, now it works.

As the rpm command output says, I don't know where they came from, nor why they seem to exist only for kernels 5 and 6, but not 7.
http://bugzilla.opensuse.org/show_bug.cgi?id=1087525#c21

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |INVALID

--- Comment #21 from Takashi Iwai <tiwai@suse.com> ---
(In reply to Marco Antonio Flores from comment #20)
> Created attachment 765641 [details]
> Output
>
> 1) Delete: files 2) Edit: blacklist 3) Run: mkinitrd
>
> Yes, now it works. As rpm command output says, I don't know where they came from, nor why do they only seemed to be for kernel 5 and 6, but not 7

Likely a leftover from the amdgpu package (DKMS?) you installed. So it's a bug in a third-party package. Let's close.