Bug ID 1180742
Summary [amdgpu] An AMD Vega series GPU randomly crashes
Classification openSUSE
Product openSUSE Distribution
Version Leap 15.2
Hardware x86-64
OS openSUSE Leap 15.2
Status NEW
Severity Normal
Priority P5 - None
Component Kernel
Assignee kernel-bugs@opensuse.org
Reporter srid@rkmail.ru
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

Created attachment 844970 [details]
partial kernel log

The amdgpu kernel driver randomly crashes the GPU, usually under load, on
Radeon VII hardware.
The GPU hang is relatively hard to hit, as it usually takes 5 to 7 days before
it crashes.
After a hang the driver attempts to reset the GPU, but sometimes the reset
fails and the system stays mostly unresponsive. It is still reachable over the
network and there is some reaction to keyboard events, but the display stays
dead. The hang also seems to drop the PCIe link down to 1.0 mode, where it
stays until reboot.
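
One way to confirm the PCIe downgrade after a hang is to compare the
negotiated link speed with the maximum the device advertises. The following is
a minimal sketch, not a definitive diagnostic: it assumes the standard sysfs
attributes current_link_speed and max_link_speed are exposed for the device,
and the PCI address used is a hypothetical placeholder that has to be replaced
with the real one from lspci. The helper names are made up for this example.
PCIe 1.0 corresponds to 2.5 GT/s. The driver may also lower the link speed for
power saving while the GPU is idle, so a low reading on its own is not proof of
the problem.

```python
import re
from pathlib import Path
from typing import Optional

# Hypothetical PCI address; substitute the GPU's address as shown by lspci.
GPU_ADDRESS = "0000:03:00.0"


def parse_link_speed(text: str) -> Optional[float]:
    """Return the leading GT/s figure from a sysfs link-speed string.

    Handles both "2.5 GT/s" and "8.0 GT/s PCIe"; returns None when the
    string does not start with a number followed by GT/s.
    """
    match = re.match(r"\s*(\d+(?:\.\d+)?)\s*GT/s", text)
    return float(match.group(1)) if match else None


def link_is_degraded(current: str, maximum: str) -> bool:
    """True when the negotiated speed is below what the device supports."""
    cur, top = parse_link_speed(current), parse_link_speed(maximum)
    return cur is not None and top is not None and cur < top


def read_link_state(address: str = GPU_ADDRESS) -> dict:
    """Read current and maximum link speed for one PCI device from sysfs."""
    dev = Path("/sys/bus/pci/devices") / address
    return {
        name: (dev / name).read_text().strip()
        for name in ("current_link_speed", "max_link_speed")
    }
```

Calling read_link_state() on the affected machine and passing the two values
to link_is_degraded() before and after a hang should show whether the link
really stays at 2.5 GT/s until reboot.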

There's an open upstream bug that may be related:
https://gitlab.freedesktop.org/drm/amd/-/issues/716

The same GPU works fine in a Windows machine.

openSUSE Leap 15.2, kernel 5.3.18-lp152.57-default #1 SMP Fri Dec 4 07:27:58
UTC 2020 (7be5551)

