LS,
I recently bought a Gigabyte Radeon RX 6600 XT card, but I quickly
noticed that my system has issues. 95% of the time when I try to
wake the system (only the screen is switched off after some time;
there is no system suspend), input and display freeze.
I noticed that the drive LED is still working, so I assume that
the rest of the system is still running.
Thinking that the PSU might not be up to its task with this new
card, I upgraded that too. Things seemed to improve, but I still
have these freezes from time to time.
Below is a snippet from the log file around the time the system
freezes. The GPU seems to have issues, from which the software
does not seem to recover:
----------------------------->
Dec 8 18:51:59 pws1 kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]]
*ERROR* Error waiting for DMUB idle: status=3
Dec 8 18:52:02 pws1 kernel: snd_hda_intel 0000:03:00.1: refused
to change power state from D3hot to D0
Dec 8 18:52:02 pws1 kernel: snd_hda_intel 0000:03:00.1: CORB
reset timeout#2, CORBRP = 65535
Dec 8 18:52:02 pws1 kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]]
*ERROR* Error waiting for DMUB idle: status=3
Dec 8 18:52:02 pws1 kernel: snd_hda_codec_hdmi hdaudioC1D0:
Unable to sync register 0x2f0d00. -5
Dec 8 18:52:02 pws1 rtkit-daemon[6110]: Supervising 7 threads of
4 processes of 1 users.
Dec 8 18:52:02 pws1 rtkit-daemon[6110]: Successfully made thread
23363 of process 6103 owned by 'frans' RT at priority 5.
Dec 8 18:52:02 pws1 rtkit-daemon[6110]: Supervising 8 threads of
4 processes of 1 users.
Dec 8 18:52:05 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Failed
to export SMU metrics table!
Dec 8 18:52:08 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm
not done with your previous command!
Dec 8 18:52:08 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Failed
to export SMU metrics table!
Dec 8 18:52:12 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm
not done with your previous command!
Dec 8 18:52:12 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Failed
to export SMU metrics table!
Dec 8 18:52:12 pws1 kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* ring gfx_0.0.0 timeout, signaled seq=2453204, emitted
seq=2453206
Dec 8 18:52:12 pws1 kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* Process information: process Xorg.bin pid 1860 thread
Xorg.bin:cs0 pid 1882
Dec 8 18:52:12 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
reset begin!
Dec 8 18:52:12 pws1 kernel: clocksource: Switched to clocksource
acpi_pm
Dec 8 18:52:12 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:52:16 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:52:15 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm
not done with your previous command!
Dec 8 18:52:15 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Failed
to disable gfxoff!
Dec 8 18:52:20 pws1 kernel: [drm:dc_dmub_srv_wait_idle [amdgpu]]
*ERROR* Error waiting for DMUB idle: status=3
Dec 8 18:52:29 pws1 kernel: amdgpu 0000:03:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test
failed (-110)
Dec 8 18:52:29 pws1 kernel: [drm:gfx_v10_0_hw_fini [amdgpu]]
*ERROR* KGQ disable failed
Dec 8 18:52:29 pws1 kernel: amdgpu 0000:03:00.0:
[drm:amdgpu_ring_test_helper [amdgpu]] *ERROR* ring kiq_2.1.0 test
failed (-110)
Dec 8 18:52:29 pws1 kernel: [drm:gfx_v10_0_hw_fini [amdgpu]]
*ERROR* KCQ disable failed
Dec 8 18:52:29 pws1 kernel: [drm:gfx_v10_0_hw_fini [amdgpu]]
*ERROR* failed to halt cp gfx
Dec 8 18:52:33 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm
not done with your previous command!
Dec 8 18:52:33 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Failed
to disable smu features.
Dec 8 18:52:33 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Fail to
disable dpm features!
Dec 8 18:52:33 pws1 kernel: [drm:amdgpu_device_ip_suspend_phase2
[amdgpu]] *ERROR* suspend of IP block <smu> failed -62
Dec 8 18:52:33 pws1 kernel: [drm] free PSP TMR buffer
Dec 8 18:52:34 pws1 kernel: [drm] psp gfx command
DESTROY_TMR(0x7) failed and response status is (0x80000306)
Dec 8 18:52:34 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: MODE1
reset
Dec 8 18:52:34 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
mode1 reset
Dec 8 18:52:34 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU smu
mode1 reset
Dec 8 18:52:37 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: SMU: I'm
not done with your previous command!
Dec 8 18:52:37 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
mode1 reset failed
Dec 8 18:52:37 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: ASIC
reset failed with error, -62 for drm dev, 0000:03:00.0
Dec 8 18:52:48 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
reset succeeded, trying to resume
Dec 8 18:52:48 pws1 kernel: [drm] PCIE GART of 512M enabled
(table at 0x00000080005A4000).
Dec 8 18:52:48 pws1 kernel: [drm] VRAM is lost due to GPU reset!
Dec 8 18:52:48 pws1 kernel: [drm] PSP is resuming...
Dec 8 18:52:49 pws1 kernel: [drm] failed to load ucode SMC(0x18)
Dec 8 18:52:49 pws1 kernel: [drm] psp gfx command LOAD_IP_FW(0x6)
failed and response status is (0x80000306)
Dec 8 18:52:49 pws1 kernel: [drm] reserve 0xa00000 from
0x81fe000000 for PSP TMR
Dec 8 18:52:51 pws1 kernel: [drm] psp gfx command
AUTOLOAD_RLC(0x21) failed and response status is (0x0)
Dec 8 18:52:51 pws1 kernel: [drm:psp_load_non_psp_fw [amdgpu]]
*ERROR* Failed to start rlc autoload
Dec 8 18:52:51 pws1 kernel: [drm:psp_resume [amdgpu]] *ERROR* PSP
resume failed
Dec 8 18:52:51 pws1 kernel: [drm:amdgpu_device_fw_loading
[amdgpu]] *ERROR* resume of IP block <psp> failed -22
Dec 8 18:52:51 pws1 kernel: [drm] Skip scheduling IBs!
Dec 8 18:52:52 pws1 kernel[1780]: Last message '[drm] Skip
schedulin' repeated 3 times, suppressed by syslog-ng on
pws1.fransdb.local
Dec 8 18:52:51 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
reset(2) failed
Dec 8 18:52:51 pws1 kernel: [drm] Skip scheduling IBs!
Dec 8 18:52:52 pws1 kernel[1780]: Last message '[drm] Skip
schedulin' repeated 36 times, suppressed by syslog-ng on
pws1.fransdb.local
Dec 8 18:52:51 pws1 kernel: amdgpu_cs_ioctl: 22 callbacks
suppressed
Dec 8 18:52:51 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:52:52 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 5 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:52:51 pws1 kernel: snd_hda_intel 0000:03:00.1: refused
to change power state from D3hot to D0
Dec 8 18:52:51 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:52:52 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 3 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:52:52 pws1 kernel: snd_hda_intel 0000:03:00.1: CORB
reset timeout#2, CORBRP = 65535
Dec 8 18:52:52 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
reset end with ret = -22
Dec 8 18:52:57 pws1 kernel: amdgpu_cs_ioctl: 43 callbacks
suppressed
Dec 8 18:52:57 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:02 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:02 pws1 kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* ring sdma1 timeout, signaled seq=20241, emitted seq=20243
Dec 8 18:53:02 pws1 kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* ring sdma0 timeout, signaled seq=14909, emitted seq=14911
Dec 8 18:53:02 pws1 kernel: [drm:amdgpu_job_timedout [amdgpu]]
*ERROR* Process information: process pid 0 thread pid 0
Dec 8 18:53:02 pws1 kernel[1780]: Last message
'[drm:amdgpu_job_time' repeated 1 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:02 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: GPU
reset begin!
Dec 8 18:53:02 pws1 kernel[1780]: Last message 'amdgpu
0000:03:00.0:' repeated 1 times, suppressed by syslog-ng on
pws1.fransdb.local
Dec 8 18:53:02 pws1 kernel: amdgpu 0000:03:00.0: amdgpu: Bailing
on TDR for s_job:4f11, as another already in progress
Dec 8 18:53:02 pws1 kernel: amdgpu_cs_ioctl: 32 callbacks
suppressed
Dec 8 18:53:02 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:08 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:07 pws1 kernel: amdgpu_cs_ioctl: 33 callbacks
suppressed
Dec 8 18:53:07 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:13 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:12 pws1 kernel: amdgpu_cs_ioctl: 29 callbacks
suppressed
Dec 8 18:53:12 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:18 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:17 pws1 kernel: amdgpu_cs_ioctl: 31 callbacks
suppressed
Dec 8 18:53:17 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:23 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:22 pws1 kernel: amdgpu_cs_ioctl: 32 callbacks
suppressed
Dec 8 18:53:22 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:28 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:28 pws1 kernel: amdgpu_cs_ioctl: 36 callbacks
suppressed
Dec 8 18:53:28 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:33 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:33 pws1 kernel: amdgpu_cs_ioctl: 25 callbacks
suppressed
Dec 8 18:53:33 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
Dec 8 18:53:38 pws1 kernel[1780]: Last message
'[drm:amdgpu_cs_ioctl' repeated 9 times, suppressed by syslog-ng
on pws1.fransdb.local
Dec 8 18:53:38 pws1 kernel: amdgpu_cs_ioctl: 36 callbacks
suppressed
Dec 8 18:53:38 pws1 kernel: [drm:amdgpu_cs_ioctl [amdgpu]]
*ERROR* Failed to initialize parser -125!
<----------------------------
After the last line the system is totally unresponsive.
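
For anyone who wants to check a longer log for the same pattern,
here is a minimal Python sketch that rejoins the wrapped lines and
counts the key amdgpu events. The helper names (unwrap, summarize)
and the labels in MARKERS are my own and purely illustrative; the
matched substrings are copied from the log above. It assumes the
classic syslog timestamp format shown there and skips the
syslog-ng "Last message ... repeated" lines.

```python
import re
from collections import Counter

# A new log record starts with a timestamp such as "Dec 8 18:52:12 ".
TS_RE = re.compile(r"^[A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2}\s")

# Only plain "kernel:" records are parsed; "kernel[1780]:" lines
# (the syslog-ng repeat notices) deliberately do not match.
RECORD_RE = re.compile(
    r"^([A-Z][a-z]{2}\s+\d{1,2}\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+(.*)$"
)

# Substrings taken from the log above, mapped to illustrative labels.
MARKERS = {
    "Error waiting for DMUB idle": "dmub_idle_timeout",
    "SMU: I'm not done with your previous command!": "smu_busy",
    "timeout, signaled seq=": "ring_timeout",
    "GPU reset begin!": "reset_begin",
    "GPU mode1 reset failed": "mode1_reset_failed",
    "GPU reset end with ret": "reset_end",
    "Failed to initialize parser -125": "cs_parser_failed",
}


def unwrap(lines):
    """Rejoin records that the mail client wrapped over several lines."""
    record = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if TS_RE.match(line):
            if record is not None:
                yield record
            record = line
        elif record is not None:
            # Continuation of the previous record.
            record += " " + line
    if record is not None:
        yield record


def summarize(lines):
    """Return (count per label, first timestamp per label)."""
    counts = Counter()
    first_seen = {}
    for record in unwrap(lines):
        match = RECORD_RE.match(record)
        if match is None:
            continue
        stamp, message = match.groups()
        for needle, label in MARKERS.items():
            if needle in message:
                counts[label] += 1
                first_seen.setdefault(label, stamp)
    return counts, first_seen
```

Called as summarize(open(path)), with path pointing at wherever
your syslog daemon writes kernel messages, it returns how often
each event occurred and when it was first seen.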
Does anybody have an idea whether this is due to a driver bug, a
firmware bug or something else?
System: Phenom II X4 965, 16 GB, Gigabyte Radeon RX 6600 XT pro
with 4K screen.
Regards, Frans.
--
A: Yes, just like that
Q: Oh, just like reading a book backwards
A: Because it upsets the natural flow of a story
Q: Why is top-posting annoying?