[Bug 1051060] New: [drm] GPU HANG: ecode 9:0:0x30b1fddf, in X [2110], reason: Hang on render ring, action: reset
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060

Bug ID: 1051060
Summary: [drm] GPU HANG: ecode 9:0:0x30b1fddf, in X [2110], reason: Hang on render ring, action: reset
Classification: openSUSE
Product: openSUSE Distribution
Version: Leap 42.3
Hardware: Other
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: X.Org
Assignee: xorg-maintainer-bugs@forge.provo.novell.com
Reporter: matwey.kornilov@gmail.com
QA Contact: xorg-maintainer-bugs@forge.provo.novell.com
Found By: ---
Blocker: ---

Created attachment 734214
  --> http://bugzilla.opensuse.org/attachment.cgi?id=734214&action=edit
/sys/class/drm/card0/error

Hello,

I am running openSUSE Leap 42.3 with kernel 4.4.76-1-default and shortly after resume from suspend I see the following in dmesg:

[ 2089.780499] [drm] GPU HANG: ecode 9:0:0x30b1fddf, in X [2110], reason: Hang on render ring, action: reset
[ 2089.780502] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
[ 2089.780503] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
[ 2089.780504] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
[ 2089.780505] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
[ 2089.780506] [drm] GPU crash dump saved to /sys/class/drm/card0/error
[ 2089.780570] drm/i915: Resetting chip after gpu hang
[ 2089.780665] [drm] RC6 on
[ 2089.796281] [drm] GuC firmware load skipped
[ 2101.816229] drm/i915: Resetting chip after gpu hang
[ 2101.816323] [drm] RC6 on
[ 2101.830392] [drm] GuC firmware load skipped

--
You are receiving this mail because:
You are on the CC list for the bug.
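For anyone comparing hang reports across machines, the "GPU HANG" line in the dmesg excerpt above can be picked apart mechanically. The following is only a minimal sketch: the regular expression is modeled on the single line quoted in this report, other kernel versions may word the message differently, and the names HANG_RE, parse_gpu_hang and SAMPLE are made up for illustration.

```python
import re

# Pattern modeled on the one "GPU HANG" line quoted in this bug report.
# Assumption: other kernels may format this message differently.
HANG_RE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<proc>\S+) \[(?P<pid>\d+)\], reason: (?P<reason>.+?), "
    r"action: (?P<action>\w+)"
)


def parse_gpu_hang(line):
    """Return the fields of a GPU HANG dmesg line as a dict, or None."""
    match = HANG_RE.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts"] = float(fields["ts"])   # seconds since boot
    fields["pid"] = int(fields["pid"])   # PID of the process named in the message
    return fields


# The line is copied from the report above.
SAMPLE = (
    "[ 2089.780499] [drm] GPU HANG: ecode 9:0:0x30b1fddf, in X [2110], "
    "reason: Hang on render ring, action: reset"
)

if __name__ == "__main__":
    print(parse_gpu_hang(SAMPLE))
```

Lines that do not match (for example "Resetting chip after gpu hang") simply yield None, so the function can be run over a whole dmesg dump to collect only the hang records.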
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c1
--- Comment #1 from Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c2
--- Comment #2 from Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c3
Stefan Dirsch
Stefan Dirsch
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c4
--- Comment #4 from Matwey Kornilov
Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c5
--- Comment #5 from Stefan Dirsch
Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c6
--- Comment #6 from Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c9
Andreas Stieger
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c10
--- Comment #10 from Stefan Dirsch
> Indeed, it uses modesetting with glamoregl. What would be the difference compared to intel_drv.so?

Intel is no longer developing intel_drv.so. It's recommended to use modesetting + glamor for newer Intel GPUs like Kabylake.
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c11
Stefan Dirsch
> Check if this is a duplicate of bug 1050256.

Indeed this would make sense. Thanks, Andreas!
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c12
--- Comment #12 from Andreas Stieger
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c13
Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c14
--- Comment #14 from Matwey Kornilov
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c15
kolA flash
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c16
--- Comment #16 from kolA flash
I've installed the package and have not seen the message since, even after resume. However, it still hangs from time to time. Unfortunately, I don't know how to debug this more deeply, since I only see a frozen, broken image on the screen. I've set up kdump, but it is not triggered by the hangs. I also use softlockup_panic=1 nmi_watchdog=panic,1 on the kernel command line.
(In reply to kolA flash from comment #15)
> I guess I might have found the same bug and reported it here: https://bugs.freedesktop.org/show_bug.cgi?id=101967
> [...] installing drm-kmp-default-4.9.33_k4.4.76_1-5.1.x86_64.rpm seems to fix the problem for me.

Looks like the bug was only partially fixed by drm-kmp-default-4.9.33_k4.4.76_1-5.1.x86_64.rpm. The same messages appear in the log, but "Resetting chip after gpu hang" now shows up only once instead of repeating, and after a few seconds the system is usable again. This may be similar to what Matwey Kornilov reported in comment #13. See here for details: https://bugs.freedesktop.org/show_bug.cgi?id=101967#c4
Cédric Heintz
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c17
t neo
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c18
Keks Dose
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c19
--- Comment #19 from t neo
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c20
Takashi Iwai
Christine Bona
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c21
--- Comment #21 from Keks Dose
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c23
David Walker
> If you have an Intel GPU with a Haswell or older chip, you can try uninstalling drm-kmp to go back to the 4.4.x kernel code. It's possible that some bugs in the 4.9.x kernel (which drm-kmp is based on) still remain for older Intel chips, although I already fixed some of them in the drm-kmp update.
> OTOH, if your chip is Skylake or newer, 4.4.x is definitely buggier than the 4.9.x code with drm-kmp, so keeping drm-kmp is still recommended.

I can confirm that removing drm-kmp-default fixed the problems with my Haswell (i5-4210M) chip. Hopefully, the 4.9.x bugs will be resolved, as I have to remove drm-kmp-default when there's a kernel update for 42.3.
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c24
--- Comment #24 from David Walker
> I can confirm that removing drm-kmp-default fixed the problems with my Haswell (i5-4210M) chip. Hopefully, the 4.9.x bugs will be resolved, as I have to remove drm-kmp-default when there's a kernel update for 42.3.

FYI, I've found that the current Tumbleweed kernel (4.11.8-2-default) and its drm-kmp-default work fine on my laptop.
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c25
--- Comment #25 from David Walker
> (In reply to David Walker from comment #23)
> I can confirm that removing drm-kmp-default fixed the problems with my Haswell (i5-4210M) chip. Hopefully, the 4.9.x bugs will be resolved, as I have to remove drm-kmp-default when there's a kernel update for 42.3.
> FYI, I've found that the current Tumbleweed kernel (4.11.8-2-default) and its drm-kmp-default work fine on my laptop.

I spoke too soon. I tried to use my webcam today (using BlueJeans with the Vivaldi browser), and it kept connecting and disconnecting until I removed drm-kmp-default.
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c26
--- Comment #26 from Takashi Iwai
(In reply to David Walker from comment #24)
> (In reply to David Walker from comment #23)
> I can confirm that removing drm-kmp-default fixed the problems with my Haswell (i5-4210M) chip. Hopefully, the 4.9.x bugs will be resolved, as I have to remove drm-kmp-default when there's a kernel update for 42.3.
> FYI, I've found that the current Tumbleweed kernel (4.11.8-2-default) and its drm-kmp-default work fine on my laptop.
> I spoke too soon. I tried to use my webcam today (using BlueJeans with the Vivaldi browser), and it kept connecting and disconnecting until I removed drm-kmp-default.

Well, that already indicates that your testing is wrong. A KMP on Leap 42.3 is tied only to the Leap 42.3 kernel, i.e. it can't influence any other kernel version, such as the TW 4.11 kernel, at all. In other words, what you saw (the problem going away after removing drm-kmp on the TW kernel) is a placebo or a coincidence.

Of course the GPU issue can be fixed by the later kernel version regardless of whether drm-kmp is installed; that's no surprise. drm-kmp simply has no influence on the newer kernel, and the GPU issue that was present in 4.9.x (the kernel version drm-kmp is based on) might already be addressed in 4.11.x.

In any case, the USB issue is unrelated to the graphics stack, so in that regard it's just a regression in the recent upstream kernel.
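The point that a KMP only affects the kernel it was built for can be illustrated with the package name quoted earlier in this thread. This is a rough sketch, not how the package manager actually resolves KMPs: it assumes the name embeds the target kernel as "_k<version>_<release>", and it uses an exact string comparison against the output of `uname -r`, which is stricter than a real system needs to be. The function names are made up for illustration.

```python
import re


def kmp_target_kernel(pkg_name):
    """Parse (flavor, kernel_version) out of a KMP package file name.

    Assumes a name shaped like
    <module>-kmp-<flavor>-<modver>_k<kver>_<krel>-<pkgrel>.<arch>.rpm,
    as in the package mentioned in this thread. Returns None otherwise.
    """
    match = re.search(
        r"-kmp-(?P<flavor>[^-]+)-[^-]*_k(?P<kver>[0-9][^-]*)-", pkg_name
    )
    if match is None:
        return None
    # In the package name the kernel release separator "-" appears as "_".
    return match.group("flavor"), match.group("kver").replace("_", "-")


def kmp_matches_kernel(pkg_name, uname_r):
    """True if the KMP name targets exactly the given `uname -r` string."""
    parsed = kmp_target_kernel(pkg_name)
    if parsed is None:
        return False
    flavor, kver = parsed
    return uname_r == f"{kver}-{flavor}"


if __name__ == "__main__":
    pkg = "drm-kmp-default-4.9.33_k4.4.76_1-5.1.x86_64.rpm"
    # Kernel strings taken from the comments above.
    print(kmp_matches_kernel(pkg, "4.4.76-1-default"))
    print(kmp_matches_kernel(pkg, "4.11.8-2-default"))
```

With the Leap 42.3 kernel string from the original report the check succeeds; with the Tumbleweed 4.11.8-2-default string it does not, which is the situation described above.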
Christian Trippe
http://bugzilla.opensuse.org/show_bug.cgi?id=1051060#c27
Max Staudt