https://bugzilla.suse.com/show_bug.cgi?id=1219444

Bug ID: 1219444
Summary: amdgpu critical error
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 15.5
Hardware: x86-64
OS: openSUSE Leap 15.5
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-bugs@opensuse.org
Reporter: teuniz@protonmail.com
QA Contact: qa-bugs@suse.de
Target Milestone: ---
Found By: ---
Blocker: ---

Created attachment 872372
  --> https://bugzilla.suse.com/attachment.cgi?id=872372&action=edit
Output of dmesg

The kernel crashes approximately every 5 minutes. I reverted to kernel 5.14.21-150500.55.19-default because with that one it crashes only about once a day.

Operating System: openSUSE Leap 15.5
KDE Plasma Version: 5.27.9
KDE Frameworks Version: 5.103.0
Qt Version: 5.15.8
Kernel Version: 5.14.21-150500.55.44-default (64-bit)
Graphics Platform: X11
Processors: 32 × 13th Gen Intel Core i9-13900K
Memory: 31.0 GiB of RAM
Graphics Processor: AMD Radeon Pro W6600
Manufacturer: HP
Product Name: HP Z2 Tower G9 Workstation Desktop PC

dmesg | grep amdgpu
[ 1.540640] [drm] amdgpu kernel modesetting enabled.
[ 1.540703] amdgpu: CRAT table not found
[ 1.540705] amdgpu: Virtual CRAT table created for CPU
[ 1.540712] amdgpu: Topology: Add CPU node
[ 1.542670] amdgpu 0000:03:00.0: amdgpu: Fetched VBIOS from VFCT
[ 1.542671] amdgpu: ATOM BIOS: 113-D5330400-100
[ 1.542770] amdgpu 0000:03:00.0: vgaarb: deactivate vga console
[ 1.542771] amdgpu 0000:03:00.0: amdgpu: Trusted Memory Zone (TMZ) feature disabled as experimental (default)
[ 1.542799] amdgpu 0000:03:00.0: amdgpu: VRAM: 8176M 0x0000008000000000 - 0x00000081FEFFFFFF (8176M used)
[ 1.542800] amdgpu 0000:03:00.0: amdgpu: GART: 512M 0x0000000000000000 - 0x000000001FFFFFFF
[ 1.542801] amdgpu 0000:03:00.0: amdgpu: AGP: 267894784M 0x0000008400000000 - 0x0000FFFFFFFFFFFF
[ 1.542845] [drm] amdgpu: 8176M of VRAM memory ready
[ 1.542845] [drm] amdgpu: 15892M of GTT memory ready.
[ 1.548699] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 1.548704] amdgpu 0000:03:00.0: amdgpu: PSP runtime database doesn't exist
[ 2.854516] amdgpu 0000:03:00.0: amdgpu: STB initialized to 2048 entries
[ 2.895100] amdgpu 0000:03:00.0: amdgpu: Will use PSP to load VCN firmware
[ 3.094413] amdgpu 0000:03:00.0: amdgpu: RAS: optional ras ta ucode is not available
[ 3.115717] amdgpu 0000:03:00.0: amdgpu: SECUREDISPLAY: securedisplay ta ucode is not available
[ 3.115740] amdgpu 0000:03:00.0: amdgpu: smu driver if version = 0x0000000f, smu fw if version = 0x00000013, smu fw program = 0, version = 0x003b2b00 (59.43.0)
[ 3.115745] amdgpu 0000:03:00.0: amdgpu: SMU driver if version not matched
[ 3.115777] amdgpu 0000:03:00.0: amdgpu: use vbios provided pptable
[ 3.165133] amdgpu 0000:03:00.0: amdgpu: SMU is initialized successfully!
[ 3.268063] kfd kfd: amdgpu: Allocated 3969056 bytes on gart
[ 3.268478] amdgpu: sdma_bitmap: ffff
[ 3.302091] amdgpu: HMM registered 8176MB device memory
[ 3.302135] amdgpu: SRAT table not found
[ 3.302136] amdgpu: Virtual CRAT table created for GPU
[ 3.302599] amdgpu: Topology: Add dGPU node [0x73e3:0x1002]
[ 3.302601] kfd kfd: amdgpu: added device 1002:73e3
[ 3.302617] amdgpu 0000:03:00.0: amdgpu: SE 2, SH per SE 2, CU per SH 8, active_cu_number 28
[ 3.302658] amdgpu 0000:03:00.0: amdgpu: ring gfx_0.0.0 uses VM inv eng 0 on hub 0
[ 3.302659] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.0 uses VM inv eng 1 on hub 0
[ 3.302659] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.0 uses VM inv eng 4 on hub 0
[ 3.302660] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.0 uses VM inv eng 5 on hub 0
[ 3.302660] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.0 uses VM inv eng 6 on hub 0
[ 3.302661] amdgpu 0000:03:00.0: amdgpu: ring comp_1.0.1 uses VM inv eng 7 on hub 0
[ 3.302661] amdgpu 0000:03:00.0: amdgpu: ring comp_1.1.1 uses VM inv eng 8 on hub 0
[ 3.302662] amdgpu 0000:03:00.0: amdgpu: ring comp_1.2.1 uses VM inv eng 9 on hub 0
[ 3.302662] amdgpu 0000:03:00.0: amdgpu: ring comp_1.3.1 uses VM inv eng 10 on hub 0
[ 3.302663] amdgpu 0000:03:00.0: amdgpu: ring kiq_2.1.0 uses VM inv eng 11 on hub 0
[ 3.302663] amdgpu 0000:03:00.0: amdgpu: ring sdma0 uses VM inv eng 12 on hub 0
[ 3.302664] amdgpu 0000:03:00.0: amdgpu: ring sdma1 uses VM inv eng 13 on hub 0
[ 3.302665] amdgpu 0000:03:00.0: amdgpu: ring vcn_dec_0 uses VM inv eng 0 on hub 1
[ 3.302665] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.0 uses VM inv eng 1 on hub 1
[ 3.302666] amdgpu 0000:03:00.0: amdgpu: ring vcn_enc_0.1 uses VM inv eng 4 on hub 1
[ 3.302666] amdgpu 0000:03:00.0: amdgpu: ring jpeg_dec uses VM inv eng 5 on hub 1
[ 3.303573] [drm] Initialized amdgpu 3.49.0 20150101 for 0000:03:00.0 on minor 0
[ 3.308709] fbcon: amdgpudrmfb (fb0) is primary device
[ 3.505728] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 3.505731] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 3.505733] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x0000073A
[ 3.505733] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: DCEDMC (0x3)
[ 3.505734] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 3.505735] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[ 3.505735] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 3.505735] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 3.505736] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 3.524299] amdgpu 0000:03:00.0: [drm] fb0: amdgpudrmfb frame buffer device
[ 4.537456] snd_hda_intel 0000:03:00.1: bound 0000:03:00.0 (ops amdgpu_dm_audio_component_bind_ops [amdgpu])
[ 5.416287] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 5.416312] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 5.416319] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x00000000
[ 5.416324] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: unknown (0x0)
[ 5.416329] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 5.416333] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x0
[ 5.416336] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x0
[ 5.416340] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x0
[ 5.416343] amdgpu 0000:03:00.0: amdgpu: RW: 0x0
[ 73.156519] amdgpu 0000:03:00.0: amdgpu: [mmhub] page fault (src_id:0 ring:157 vmid:0 pasid:0, for process pid 0 thread pid 0)
[ 73.156538] amdgpu 0000:03:00.0: amdgpu: in page starting at address 0x0000000006004000 from client 0x12 (VMC)
[ 73.156546] amdgpu 0000:03:00.0: amdgpu: MMVM_L2_PROTECTION_FAULT_STATUS:0x0000073A
[ 73.156551] amdgpu 0000:03:00.0: amdgpu: Faulty UTCL2 client ID: DCEDMC (0x3)
[ 73.156562] amdgpu 0000:03:00.0: amdgpu: MORE_FAULTS: 0x0
[ 73.156566] amdgpu 0000:03:00.0: amdgpu: WALKER_ERROR: 0x5
[ 73.156570] amdgpu 0000:03:00.0: amdgpu: PERMISSION_FAULTS: 0x3
[ 73.156578] amdgpu 0000:03:00.0: amdgpu: MAPPING_ERROR: 0x1
[ 73.156582] amdgpu 0000:03:00.0: amdgpu: RW: 0x0

uname -a
5.14.21-150500.55.44-default #1 SMP PREEMPT_DYNAMIC Mon Jan 15 10:03:40 UTC 2024 (cc7d8b6) x86_64 x86_64 x86_64 GNU/Linux

lspci
VGA compatible controller: Advanced Micro Devices, Inc. [AMD/ATI] Navi 23 WKS-XL [Radeon PRO W6600]
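
For triage, the per-field values that the driver prints after each page fault can be recomputed from the raw MMVM_L2_PROTECTION_FAULT_STATUS word. The following is a minimal Python sketch, not code from the driver. The bit positions in FIELDS are an assumption; they were chosen because they reproduce the values logged in this report for 0x0000073A (WALKER_ERROR 0x5, PERMISSION_FAULTS 0x3, MAPPING_ERROR 0x1, client ID 0x3). The names FIELDS and decode_fault_status are hypothetical.

```python
# Assumed (bit offset, bit width) of each field inside the 32-bit
# MMVM_L2_PROTECTION_FAULT_STATUS value. This layout is an assumption,
# checked only against the values printed in the dmesg output of this report.
FIELDS = {
    "MORE_FAULTS": (0, 1),
    "WALKER_ERROR": (1, 3),
    "PERMISSION_FAULTS": (4, 4),
    "MAPPING_ERROR": (8, 1),
    "CID": (9, 9),  # logged as "Faulty UTCL2 client ID"
    "RW": (18, 1),
}


def decode_fault_status(status: int) -> dict:
    """Split a raw status word into its fields by shifting and masking."""
    return {
        name: (status >> offset) & ((1 << width) - 1)
        for name, (offset, width) in FIELDS.items()
    }


if __name__ == "__main__":
    # Status value taken from the faults at 3.505733 and 73.156546.
    for name, value in decode_fault_status(0x0000073A).items():
        print(f"{name}: {value:#x}")
```

Under this assumed layout the fault at 5.416319, which logs a status of 0x00000000, decodes to all-zero fields, which matches the "unknown (0x0)" client ID printed for it.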