[Bug 1212691] New: amdgpu Polaris11 not recognized during boot
https://bugzilla.suse.com/show_bug.cgi?id=1212691

            Bug ID: 1212691
           Summary: amdgpu Polaris11 not recognized during boot
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 15.5
          Hardware: x86-64
                OS: openSUSE Leap 15.5
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: bdamasceno@hotmail.com.br
        QA Contact: qa-bugs@suse.de
  Target Milestone: ---
          Found By: ---
           Blocker: ---

Created attachment 867809
  --> https://bugzilla.suse.com/attachment.cgi?id=867809&action=edit
Leap 15.5 boot log using 15.5 and 15.4 kernels

After upgrading from leap 15.4 to 15.5, the new kernel can't recognize my
Polaris11 AMD GPU anymore.
The display works only if I choose the last Leap 15.4 kernel that is still
available on grub's menu.

--
You are receiving this mail because:
You are the assignee for the bug.
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c2

Bruno Damasceno Freire <bdamasceno@hotmail.com.br> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(bdamasceno@hotmai |
                   |l.com.br)                   |

--- Comment #2 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
No firmware updates were offered since I upgraded to Leap 15.5.

# env LANGUAGE=en-us zypper info kernel-firmware-amdgpu
Loading repository data...
Reading installed packages...

Information for package kernel-firmware-amdgpu:
-----------------------------------------------
Repository     : Main Repository
Name           : kernel-firmware-amdgpu
Version        : 20230320-150500.1.1
Arch           : noarch
Vendor         : SUSE LLC <https://www.suse.com/>
Installed Size : 16,0 MiB
Installed      : Yes (automatically)
Status         : up-to-date
Source package : kernel-firmware-20230320-150500.1.1.src
Upstream URL   : https://git.kernel.org/cgit/linux/kernel/git/firmware/linux-firmware.git/
Summary        : Kernel firmware files for AMDGPU graphics driver
Description    :
    This package contains compressed kernel firmware files for
    AMDGPU graphics driver.
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c4

Bruno Damasceno Freire <bdamasceno@hotmail.com.br> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(bdamasceno@hotmai |
                   |l.com.br)                   |

--- Comment #4 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
Created attachment 868170
  --> https://bugzilla.suse.com/attachment.cgi?id=868170&action=edit
lsinitrd

# env LANGUAGE=en-us lsinitrd /boot/initrd-5.14.21-150500.53-default

/lib/firmware/amdgpu/* files (present)
/lib/firmware/amdgpu/polaris11_ce.bin (absent)
/lib/firmware/amdgpu/polaris11_ce.bin.xz (absent)

/lib/firmware/amdgpu/polaris11* files:
polaris11_k2_smc.bin.xz
polaris11_k_mc.bin.xz
polaris11_k_smc.bin.xz
polaris11_mc.bin.xz
polaris11_me_2.bin.xz
polaris11_me.bin.xz
polaris11_mec2_2.bin.xz
polaris11_mec_2.bin.xz
polaris11_mec.bin.xz
polaris11_pfp_2.bin.xz
polaris11_pfp.bin.xz
polaris11_rlc.bin.xz
polaris11_sdma1.bin.xz -> polaris10_sdma1.bin.xz
polaris11_sdma.bin.xz -> polaris10_sdma.bin.xz
polaris11_smc.bin.xz
polaris11_smc_sk.bin.xz
polaris11_uvd.bin.xz -> polaris10_uvd.bin.xz
polaris11_vce.bin.xz -> polaris10_vce.bin.xz
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c5

Takashi Iwai <tiwai@suse.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|                            |needinfo?(bdamasceno@hotmai
                   |                            |l.com.br)

--- Comment #5 from Takashi Iwai <tiwai@suse.com> ---
Now please check three things:
- the output of "modinfo amdgpu"; whether it contains the missing entries for
  polaris11
- whether you have the actual file in /lib/firmware/*
- whether you have enough disk space for /boot

If all look OK, then it's likely something wrong with dracut.
You can try to add a file /etc/dracut.conf.d/90-polaris11.conf containing the
following line:
  install_items+=" /lib/firmware/amdgpu/polaris11_ce.bin.xz"

then rebuild initrd and retest.
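A minimal sketch of the drop-in step suggested in comment #5, assuming a POSIX shell. The function name `write_polaris11_conf` and its optional directory argument are hypothetical, added only so the step can be tried in a scratch directory before touching /etc/dracut.conf.d. The file name and the `install_items` line are copied from the comment.

```shell
# Hypothetical helper: write the dracut drop-in from comment #5.
# $1 is the target directory; it defaults to /etc/dracut.conf.d
# (writing there requires root).
write_polaris11_conf() {
    conf_dir="${1:-/etc/dracut.conf.d}"
    mkdir -p "$conf_dir" || return 1
    # The line below is taken verbatim from the comment.
    printf '%s\n' 'install_items+=" /lib/firmware/amdgpu/polaris11_ce.bin.xz"' \
        > "$conf_dir/90-polaris11.conf"
}

# Example (dry run into a scratch directory):
#   write_polaris11_conf /tmp/dracut-test && cat /tmp/dracut-test/90-polaris11.conf
```

The initrd still has to be rebuilt afterwards for the drop-in to take effect.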
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c6

Bruno Damasceno Freire <bdamasceno@hotmail.com.br> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(bdamasceno@hotmai |
                   |l.com.br)                   |

--- Comment #6 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
Created attachment 868181
  --> https://bugzilla.suse.com/attachment.cgi?id=868181&action=edit
detailed info + 90-polaris11.conf result

- the output of "modinfo amdgpu"; whether it contains the missing entries for
  polaris11 (NO)

- whether you have the actual file in /lib/firmware/* (YES)
  # dir /lib/firmware/amdgpu/polaris11_ce*
  polaris11_ce_2.bin.xz -> polaris10_ce_2.bin.xz
  polaris11_ce.bin.xz -> polaris10_ce.bin.xz

- whether you have enough disk space for /boot (YES)
  # btrfs fi usage /
  Overall:
      Free (estimated):   31.52GiB   (min: 31.52GiB)
      Free (statfs, df):  31.52GiB

- add a file /etc/dracut.conf.d/90-polaris11.conf (OK)

- rebuild initrd (OK)
  # dracut --rebuild /boot/initrd-5.14.21-150500.53-default

- retest lsinit (OK)
  # lsinitrd /boot/initrd-5.14.21-150500.53-default | grep -i polaris11_ce
  lib/firmware/amdgpu/polaris11_ce.bin.xz -> polaris10_ce.bin.xz

- retest reboot (OK)

The Leap 15.5 with kernel-default-5.14.21-150500.53 booted just fine after
adding the /etc/dracut.conf.d/90-polaris11.conf file.
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c9

--- Comment #9 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
Created attachment 868183
  --> https://bugzilla.suse.com/attachment.cgi?id=868183&action=edit
dracut debug log

(In reply to Takashi Iwai from comment #8)
> Scratch my previous comment, I saw you already gave the info.
>
> But, as far as I see, "modinfo amdgpu" gives the line
>   firmware:       amdgpu/polaris11_ce.bin
> and yet this firmware wasn't included in initrd?
>
> If so, please run dracut with --debug option, and give the whole output.

You got it right. I should have marked this question with a "YES". Sorry.
And YES, even though it is listed by "modinfo amdgpu", it wasn't included in
initrd AFAICS.
Dracut log attached.
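The check discussed above (firmware listed by "modinfo amdgpu" but absent from the initrd) can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name `missing_firmware` is hypothetical, it works on saved output files rather than calling the tools itself, and it assumes the lsinitrd listing shows paths containing `lib/firmware/` with an optional `.xz` or `.zst` suffix, as in the listing from comment #6.

```shell
# Hypothetical helper: print firmware names that modinfo lists but that do
# not appear in an lsinitrd listing.
#   $1 = file with the output of "modinfo amdgpu"
#   $2 = file with the output of "lsinitrd <initrd image>"
missing_firmware() {
    awk '
        # Listing pass: remember every firmware path, relative to
        # lib/firmware/, with a trailing .xz or .zst removed.
        mode == "initrd" {
            for (i = 1; i <= NF; i++) {
                if ($i == "->") break          # skip symlink targets
                p = index($i, "lib/firmware/")
                if (p) {
                    name = substr($i, p + length("lib/firmware/"))
                    sub(/\.(xz|zst)$/, "", name)
                    have[name] = 1
                }
            }
            next
        }
        # modinfo pass: report "firmware:" entries not seen in the listing.
        $1 == "firmware:" && !($2 in have) { print $2 }
    ' mode=initrd "$2" mode=modinfo "$1"
}

# Example:
#   modinfo amdgpu > modinfo.txt
#   lsinitrd /boot/initrd-5.14.21-150500.53-default > initrd.txt
#   missing_firmware modinfo.txt initrd.txt
```

An empty result would mean every firmware file that the module declares was found in the image under these naming assumptions.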
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c12

Bruno Damasceno Freire <bdamasceno@hotmail.com.br> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
 Attachment #868170|0                           |1
        is obsolete|                            |
 Attachment #868183|0                           |1
        is obsolete|                            |
              Flags|needinfo?(bdamasceno@hotmai |
                   |l.com.br)                   |

--- Comment #12 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
Created attachment 868209
  --> https://bugzilla.suse.com/attachment.cgi?id=868209&action=edit
dracut debug + console capture + some filtering
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c17

--- Comment #17 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
(In reply to Filipe Manana from comment #16)
> ... Just check if the target directory has nodatacow set (the "C" in lsattr's
> output), see the changelog.

# btrfs property get /lib
compression=zstd:1

# btrfs property get /lib/firmware/amdgpu/polaris10_ce.bin.xz
compression=zstd

# lsattr /lib/firmware/amdgpu/polaris10_ce.bin.xz
--------c------------- /lib/firmware/amdgpu/polaris10_ce.bin.xz

# lsattr /var/tmp
---------------C------ /var/tmp/zypp.iy8OXE
---------------C------ /var/tmp/zypp.0yjeP4
---------------C------ /var/tmp/zypp.BQbptY
---------------C------ /var/tmp/zypp.56R7ao
---------------C------ /var/tmp/systemd-private-e82735bfd4554146a13b8fbd8427a622-systemd-logind.service-1kub9g
---------------C------ /var/tmp/systemd-private-e82735bfd4554146a13b8fbd8427a622-chronyd.service-iJn1dh
---------------C------ /var/tmp/systemd-private-e82735bfd4554146a13b8fbd8427a622-upower.service-cAkj6i
---------------C------ /var/tmp/systemd-private-e82735bfd4554146a13b8fbd8427a622-rtkit-daemon.service-morsdg
---------------C------ /var/tmp/systemd-private-e82735bfd4554146a13b8fbd8427a622-power-profiles-daemon.service-ZEQGKg

# env LANGUAGE=en-us lsattr /var/tmp/polaris10_ce.bin
lsattr: No such file or directory while trying to stat /var/tmp/polaris10_ce.bin

# env LANGUAGE=en-us cp --preserve=mode,xattr,timestamps,ownership /lib/firmware/amdgpu/polaris10_ce.bin.xz /var/tmp
cp: setting attribute 'btrfs.compression' for 'btrfs.compression': Invalid argument

# lsattr /var/tmp/polaris10_ce.bin
---------------C------ /var/tmp/polaris10_ce.bin.xz
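The "C" check requested in comment #16 can be scripted against lsattr output. The sketch below assumes a POSIX shell and a hypothetical helper named `lsattr_has_flag`. It only inspects the flag column of one lsattr line and is case-sensitive, so the lowercase "c" (compression) on the firmware file is not confused with the uppercase "C" (nodatacow) on the entries under /var/tmp.

```shell
# Hypothetical helper: succeed if the flag column of one lsattr output line
# contains the given flag character (case-sensitive).
#   $1 = one line of lsattr output, e.g. "---------------C------ /var/tmp"
#   $2 = flag character, e.g. C for nodatacow or c for compression
lsattr_has_flag() {
    flags=${1%% *}              # keep only the text before the first space
    case "$flags" in
        *"$2"*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Example:
#   lsattr_has_flag "$(lsattr -d /var/tmp)" C && echo "nodatacow set on /var/tmp"
```

In the transcript above, the source file carries "c" and the copy in /var/tmp carries "C", which is the combination under which cp reported "Invalid argument" for the btrfs.compression attribute.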
https://bugzilla.suse.com/show_bug.cgi?id=1212691#c20

--- Comment #20 from Bruno Damasceno Freire <bdamasceno@hotmail.com.br> ---
Created attachment 868386
  --> https://bugzilla.suse.com/attachment.cgi?id=868386&action=edit
dracut console capture + some filtering (7)

(In reply to Goldwyn Rodrigues from comment #19)
> Not sure if we can put filesystem specific code in coreutils/cp.
>
> In the meantime, the workaround is to set environment variable
> DRACUT_NO_XATTR=true while running dracut.

Please correct me if I am getting anything wrong:

My corner case is about getting an existing Leap, with the btrfs compression
property on /lib and /usr, properly upgraded.
The expanded case is an initrd being created with missing files because of the
btrfs compression property, due to incompatible file attributes between the
system folder and the tmp folder.

Then these two possibilities came to my mind:
1) Changing the dracut tmp subfolder attribute to COW on btrfs partitions.
2) Scanning for the btrfs compression property beforehand and setting the
   DRACUT_NO_XATTR=true env variable if appropriate.

I don't know the complexities associated with these suggestions, so they are
purely speculative brainstorming.

On my machine I did some experiments and got these results (more in the
attachment):

                                   |debug3|debug5|debug6|debug7|
console string count
--> 'cp ret = 256'                 | 1111 | 1110 |  697 |    0 |
--> 'dracut_install ret = 256'     | 1111 | 1110 |  697 |    0 |
--> 'install error'                |  116 |  115 |  115 |    0 |
--> 'dracut-install: ERROR:'       |  132 |  132 |  129 |    0 |
--> 'missing'                      | 1632 | 1632 | 1650 | 1650 |
--> 'hash hit'                     | 5901 | 5902 | 5902 | 6009 |
--> ' OK'                          |  243 |  244 |  624 | 1104 |
btrfs property compression
--> polaris10_ce.bin.xz            |   x  |   -  |   -  |   -  |
--> /lib                           |   x  |   x  |   -  |   -  |
--> /usr                           |   x  |   x  |   x  |   x  |
other info
--> working video                  |   -  |   x  |   x  |   x  |
--> DRACUT_NO_XATTR=true           |   -  |   -  |   -  |   x  |
--> lsinitrd file count            |   -  | 1834 |   -  | 1911 |

So, for now, I will be taking Goldwyn Rodrigues' advice and will apply
DRACUT_NO_XATTR=true in /etc/environment on my 5 installations (2 Leap, 3
Tumbleweed), since they will all keep the btrfs compression property set on
/usr.

The btrfs compression property shouldn't be set on /lib and must have been a
leftover from some test. For my "luck" it was this very leftover that triggered
this issue.
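The "console string count" rows in the table can be reproduced from a saved dracut console capture along these lines. This is a rough sketch, not the reporter's actual filtering: the function name `count_markers` is hypothetical, only the four error markers from the table are included, and `grep -c` counts matching lines rather than individual occurrences.

```shell
# Hypothetical helper: count lines in a dracut console capture that contain
# each failure marker named in the table.
#   $1 = path to the saved console capture
count_markers() {
    log=$1
    for marker in 'cp ret = 256' 'dracut_install ret = 256' \
                  'install error' 'dracut-install: ERROR:'; do
        # -F: fixed-string match, -c: print the number of matching lines
        printf '%s\t%s\n' "$(grep -cF -- "$marker" "$log")" "$marker"
    done
}

# Example:
#   count_markers dracut-console.log
```

Comparing the counts from a capture taken with DRACUT_NO_XATTR=true against one taken without it would show whether the copy failures disappear, as they did in the debug7 column.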