https://bugzilla.suse.com/show_bug.cgi?id=1234320

Bug ID: 1234320
Summary: Integrated AMD GPU randomly resets and crashes desktop
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: vortex@z-ray.de
QA Contact: qa-bugs@suse.de
Target Milestone: ---
Found By: ---
Blocker: ---

Hey there,

I have had this issue for as long as I have owned this GPU. It is the integrated AMD GPU of the Ryzen 7 7800X3D. Every now and then the GPU randomly resets, causing a desktop freeze followed by a crash, and I find myself back at the GNOME login screen with all unsaved work lost.

Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring gfx_0.0.0 timeout, signaled seq=20380, emitted seq=20382
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: Process information: process Xwayland pid 3428 thread Xwayland:cs0 pid 3429
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: GPU reset begin!
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: Dumping IP State
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: Dumping IP State Completed
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: MODE2 reset
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: GPU reset succeeded, trying to resume
Dez 09 14:05:55 makron kernel: [drm] PCIE GART of 1024M enabled (table at 0x000000F41FC00000).
Dez 09 14:05:55 makron kernel: [drm] VRAM is lost due to GPU reset!
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: PSP is resuming...
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: reserve 0xa00000 from 0xf41e000000 for PSP TMR
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: RAS: optional ras ta ucode is not available
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: RAP: optional rap ta ucode is not available
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: SMU is resuming...
Dez 09 14:05:55 makron kernel: amdgpu 0000:10:00.0: amdgpu: SMU is resumed successfully!
Dez 09 14:05:55 makron kernel: [drm] DMUB hardware initialized: version=0x05001C00
Dez 09 14:05:56 makron kernel: [drm] kiq ring mec 2 pipe 1 q 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring gfx_0.1.0 uses VM inv eng 1 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 4 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 5 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 6 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 7 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 8 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 9 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 10 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 11 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring kiq_0.2.1.0 uses VM inv eng 12 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring sdma0 uses VM inv eng 13 on hub 0
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 8
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 8
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 8
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 8
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: recover vram bo from shadow start
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: recover vram bo from shadow done
Dez 09 14:05:56 makron kernel: amdgpu 0000:10:00.0: amdgpu: GPU reset(2) succeeded!
Dez 09 14:05:56 makron kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to initialize parser -125!
Dez 09 14:05:56 makron gnome-shell[3428]: amdgpu: The CS has cancelled because the context is lost. This context is innocent.
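
For anyone who wants to count how often this happens in a longer log, here is a minimal Python sketch that picks out the ring timeout lines and the process reported right after them. The regular expressions are only modelled on the two message formats shown in the excerpt above; the function name `find_ring_timeouts` is made up for illustration, and other kernel versions may word these messages differently.

```python
import re

# Patterns modelled on the two log lines in the excerpt above (an assumption:
# other kernel versions may format these messages differently).
TIMEOUT_RE = re.compile(
    r"amdgpu: ring (?P<ring>\S+) timeout, "
    r"signaled seq=(?P<signaled>\d+), emitted seq=(?P<emitted>\d+)"
)
PROCESS_RE = re.compile(
    r"amdgpu: Process information: process (?P<process>\S+) pid (?P<pid>\d+)"
)


def find_ring_timeouts(lines):
    """Return one dict per ring timeout, plus the process named after it, if any."""
    events = []
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            events.append({
                "ring": match.group("ring"),
                "signaled": int(match.group("signaled")),
                "emitted": int(match.group("emitted")),
                "process": None,
                "pid": None,
            })
            continue
        match = PROCESS_RE.search(line)
        # Attach the process to the most recent timeout that has none yet.
        if match and events and events[-1]["process"] is None:
            events[-1]["process"] = match.group("process")
            events[-1]["pid"] = int(match.group("pid"))
    return events
```

The function accepts any iterable of log lines, so it can be fed an open log file or `sys.stdin` when the output of `journalctl -k -b` is piped into a script that calls it.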

The GPU itself is called "AMD Radeon Graphics (RADV RAPHAEL_MENDOCINO)", at least according to the Vulkan info from radv. At this point I am not sure whether this is a general amdgpu driver bug that would be better reported to the upstream kernel. I attached my full system log since the last boot, covering the time of the crash.

Additionally, I'd like to state that I run a dual-GPU system with an NVIDIA GPU as the secondary GPU. If I plug all my displays into the NVIDIA GPU, so that it drives the whole desktop, none of this happens. On other AMD GPUs (running Aeon with a recent kernel) I have not observed this issue: one is a Radeon RX 7700 XT, the other the Steam Deck APU. But I would really like to make use of both GPUs for better power efficiency, even though this is a desktop PC.

Kind regards,
V.