[kernel-bugs] [Bug 1177973] New: amdgpu: kernel panic on boot upon modeset on AMD Renoir APU
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973

            Bug ID: 1177973
           Summary: amdgpu: kernel panic on boot upon modeset on AMD Renoir APU
    Classification: openSUSE
           Product: openSUSE Tumbleweed
           Version: Current
          Hardware: x86-64
                OS: openSUSE Tumbleweed
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-bugs@opensuse.org
          Reporter: nospam20@randolf.at
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Created attachment 842864
  --> http://bugzilla.opensuse.org/attachment.cgi?id=842864&action=edit
screenshot of kernel panic upon booting kernel-default-5.8.14

Hi All,

I'm currently on kernel-default-5.8.14-1.2.x86_64 and kernel-firmware...20201005.

Upon boot, my system (AMD Ryzen 5 PRO 4650G with Radeon Graphics, Gigabyte B550M AORUS PRO (rev. 1.0) with the most recent BIOS F10, display DELL U4320Q at 3840x2160 via DisplayPort) frequently hangs with a kernel panic. Statistically, about one out of 5 boot attempts is successful; sometimes it works on the first attempt, sometimes it takes significantly more than 5 attempts. It seems random. However, once boot has succeeded, I can run the computer without any stability issues for the entire day.

As the machine works fine apart from booting (or with nomodeset to avoid loading amdgpu), I think defective hardware can be ruled out. The only thing that does not work reliably is booting.

Unfortunately I have not been able to get a direct log of the kernel panic; I only managed to take a photo of the screen (attached).

I also tried kernel:stable as of today: kernel-default-5.9.1-1.1.g8abc535.x86_64 plus the corresponding firmware from the same source. Same result: most boot attempts lead to a kernel panic.

This is NOT a new problem / regression with a specific kernel or Tumbleweed version; I have been experiencing these problems since I bought the Ryzen APU (plus mainboard). I have tested various kernel versions starting from 5.8.x for about 2 months, all with more or less the same behavior.

As this is my first bug submission here, please be patient if any required information is missing; I'll try my best to deliver it upon request.

Thanks!

-- 
You are receiving this mail because:
You are the assignee for the bug.
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c1
--- Comment #1 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c2
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c3
--- Comment #3 from Bernhard Randolf
> Does 5.7.x kernel work? You can find an old kernel package in my OBS home:tiwai:kernel:5.7 repo.
> http://download.opensuse.org/repositories/home:/tiwai:/kernel:/5.7/standard/

Yes, it seems that 5.7.12-1.g9c98feb-default from your repo above does indeed work. At least, boot worked on the first attempt in 4 out of 4 boots now (I tried different scenarios in case this matters: reboot, hard reset, power off + power on).

I know 4 attempts is _not_ enough for statistics, but hey, that's the longest sequence of successful boots I have ever had on this rig :-)
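As a rough check on how much four clean boots in a row say, the chance of such a streak happening by luck alone can be estimated. This is only a sketch: it assumes the roughly one-in-five success rate reported for the 5.8.x kernels and treats boot attempts as independent, neither of which has been measured. The function name is made up for illustration.

```python
def chance_of_streak(success_rate: float, attempts: int) -> float:
    """Probability that `attempts` independent boots all succeed."""
    if not 0.0 <= success_rate <= 1.0:
        raise ValueError("success_rate must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    return success_rate ** attempts


# Assumed inputs: about 1 in 5 boots succeeded on 5.8.x, 4 boots tried on 5.7.12.
print(f"{chance_of_streak(0.2, 4):.4f}")  # prints 0.0016
```

Under these assumptions, four lucky boots in a row would be expected in well under 1% of cases, so the streak is suggestive, although more boots would make it firmer.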
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c4
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c5
--- Comment #5 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c6
--- Comment #6 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c7
--- Comment #7 from Bernhard Randolf
> The build should finish after some time (usually an hour or so), and will appear at
> http://download.opensuse.org/repositories/home:/tiwai:/bsc1177973/standard/

Up to now, there is no x86_64 subdirectory at the URL above. Has something gone wrong, or did I just not wait long enough?

The output of hwinfo has already been attached to this bug; I hope it is in a usable format...
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c8
--- Comment #8 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c11
Bernhard Randolf
> Never mind, the package is available now on the URL.

Sorry, I did not manage to download & test earlier...

kernel-default-5.9.1-1.1.gc31670b.x86_64 boots without trouble (5 out of 5 attempts). Thank you so much! Hero --> Takashi :-)

I'm attaching dmesg outputs of 2 boot attempts:

attempt #1 does not show the familiar warning originating from dal_gpio_open_ex. This attempt is most likely one that would also have succeeded with the kernels from the kernel:stable repo that I used previously.

attempt #2 does include the dal_gpio_open_ex warning. This boot was only able to succeed with your magic kernel.

Once again, thanks for all your efforts and this terrific support!!
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c12
--- Comment #12 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c13
--- Comment #13 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c14
--- Comment #14 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c15
--- Comment #15 from Bernhard Randolf
> Could you check which package does xcmddc belong to?
> % rpm -qf $(which xcmddc)

xcm-0.5.4-lp152.3.5.x86_64

(I'm back on Leap 15.2 today, using your kernel, but can change back to Tumbleweed easily if required.)

> Also, if possible, identify who uses it. This can be udev.

How would I do that? I tried

  % rpm -q --whatrequires xcm
  no package requires xcm

So I did

  % zypper rm xcm

which did not uninstall anything else (as expected). The reboot succeeded, and I have not yet noticed anything that stopped working after removing xcm. See dmesg_attempt3_5.9.1-1.gc31670b-default.out, which I will upload in a minute.

Out of curiosity, I tried to reboot with kernel-default-5.9.0-2.1, which was not yet purged (installed from Kernel:Head last Saturday). This one now also boots without trouble (only tried twice so far, so this might need further investigation). See dmesg_attempt4_5.9.0-2.gb1f22f7-default.out, which will follow asap.

So xcmddc might possibly be the culprit in this case. No idea why this is installed on my system; I cannot recall ever installing it manually (which does not mean anything, I don't remember a lot of things I may have done, people say...).

I'll do some more reboots with this 5.9.0 kernel and see if I can reproduce the kernel panic once again in the meantime...
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c16
--- Comment #16 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c17
--- Comment #17 from Bernhard Randolf
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c18
--- Comment #18 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=1177973#c19
--- Comment #19 from Bernhard Randolf
> I guess you might be able to trigger the bug by running like
> xcmddc --i2c /dev/$I2C --identify
> where $I2C is i2c-0 or such existing device file. Run the above from multiple places at the same time, and you might see the kernel warning again.

For the record: I did several more (more than 10) reboots with 5.9.0 in the absence of the xcm package, and 100% of them were successful. After reinstalling xcm, the first boot with 5.9.0 failed with the familiar kernel panic. 5.9.1 with your fix still works :-)

I was not able to reproduce the warnings with 3 concurrent

  xcmddc --i2c /dev/i2c-0 --identify

calls in a while-true loop from bash with your 5.9.1 kernel (at least I did not find any warnings in dmesg or the journal), but maybe "same time" is not that easy to achieve (or I just did not manage to find the warnings...).
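The concurrent test could also be scripted so that the processes are started back to back from one place rather than from separate shells. The following is a minimal sketch in Python, assuming xcmddc is installed and /dev/i2c-0 exists. Only the xcmddc command line comes from this thread; the helper names, the round count and the choice of three copies per round are illustrative assumptions. Whether this narrows the timing enough to hit the race is not known.

```python
import subprocess
import sys


def identify_cmd(device="/dev/i2c-0"):
    """Build the xcmddc call quoted in this thread for one i2c device node."""
    return ["xcmddc", "--i2c", device, "--identify"]


def run_concurrently(cmd, copies=3):
    """Start `copies` processes of `cmd` back to back, then wait for all.

    Returns the exit codes in launch order.
    """
    procs = [
        subprocess.Popen(cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        for _ in range(copies)
    ]
    return [proc.wait() for proc in procs]


def stress(device="/dev/i2c-0", copies=3, rounds=100):
    """Repeat the concurrent launch; `rounds` is an arbitrary choice."""
    for round_no in range(rounds):
        codes = run_concurrently(identify_cmd(device), copies)
        print(f"round {round_no}: exit codes {codes}", file=sys.stderr)
```

Calling stress() with sufficient permissions on the device node and then searching dmesg for dal_gpio_open_ex would be the scripted equivalent of the bash loop.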