[Bug 1077885] New: GPU hang (Intel Mobile 4 Series Integrated Graphics Controller)

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885

            Bug ID: 1077885
           Summary: GPU hang (Intel Mobile 4 Series Integrated Graphics Controller)
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 42.3
          Hardware: x86-64
                OS: openSUSE 42.3
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-maintainers@forge.provo.novell.com
          Reporter: carlos.e.r@opensuse.org
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

This is similar to "Bug 1050256 - GPU hang", but with a different GPU. The symptoms are the same, but because the GPU is different I was told to create a new report.

I have had this issue since upgrading my laptop from 42.2 to 42.3, using the offline or DVD upgrade method.

CPU:
  Model: 6.23.10 "Pentium(R) Dual-Core CPU T4300 @ 2.10GHz"

Video:
  Model: "Intel Mobile 4 Series Chipset Integrated Graphics Controller"
  Vendor: pci 0x8086 "Intel Corporation"
  Device: pci 0x2a42 "Mobile 4 Series Chipset Integrated Graphics Controller"
  SubVendor: pci 0x103c "Hewlett-Packard Company"
  SubDevice: pci 0x3069
  Revision: 0x07
  Driver: "i915"
  Driver Modules: "i915"

(hwinfo output will be attached)

Crash log:

<3.6> 2018-01-27 12:47:05 minas-tirith systemd 1 - - Started Postfix Mail Transport Agent.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808879] [drm] GPU HANG: ecode 4:0:0xfdefffff, in X [2154], reason: Hang on render ring, action: reset
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808883] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808884] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808884] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808885] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808885] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<0.5> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808914] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 12:47:26 minas-tirith kernel - - - [ 1137.820965] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 12:47:36 minas-tirith kernel - - - [ 1147.820140] drm/i915: Resetting chip after gpu hang

I mentioned this on the openSUSE mailing list, and Dave Plater suggested nomodeset. This works, but the video mode changes to something like 800*600, which is pretty bad. He also suggested reopening this Bugzilla.

At that moment I had kernel 4.4.104-39 and drm-kmp-default 4.9.33_k4.4.79_4-5.2. I updated to his version, drm-kmp-default-4.9.33_k4.4.104_39-7.24.x86_64.rpm; this is more stable, but in the end the X environment froze: the mouse moved, but nothing responded. I could still switch consoles with ctrl-alt-f1.

I see several entries like this in the log (with different PIDs); I don't know if they are related:

<3.6> 2018-01-27 19:58:34 minas-tirith console-kit-daemon 3128 - - (process:10750): GLib-CRITICAL **: g_slice_set_config: assertion 'sys_page_size == 0' failed

I hibernated the machine and went back home. After restoring (not restarting) it, I see this in the log:

<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - System resumed.
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - INFO: running /usr/lib/systemd/system-sleep/grub2.sleep for hibernate
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - INFO: Running grub-once-restore ..
<3.6> 2018-01-27 21:16:36 minas-tirith systemd-sleep 10886 - - 2018-01-27 21:16:36+01:00 - Thawing the system now...
<3.4> 2018-01-27 21:16:36 minas-tirith systemd-sh - - - Thawing the system now...
<3.6> 2018-01-27 21:16:37 minas-tirith systemd 1 - - Stopped Deferred execution scheduler.
<3.6> 2018-01-27 21:16:37 minas-tirith systemd 1 - - Started Deferred execution scheduler.
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - Laptop mode
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - enabled, not active [unchanged]
<3.6> 2018-01-27 21:16:37 minas-tirith systemd-sleep 10886 - - INFO: Done.
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - Laptop mode
<3.6> 2018-01-27 21:16:37 minas-tirith laptop-mode - - - enabled, not active [unchanged]
<3.6> 2018-01-27 21:16:37 minas-tirith systemd-sleep 10886 - - tput: No value for $TERM and no -T specified
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816731] [drm] GPU HANG: ecode 4:0:0xfdeffdfb, in X [2171], reason: Hang on render ring, action: reset
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816736] [drm] GPU hangs can indicate a bug anywhere in the entire gfx stack, including userspace.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816736] [drm] Please file a _new_ bug report on bugs.freedesktop.org against DRI -> DRM/Intel
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816737] [drm] drm/i915 developers can then reassign to the right component if it's not a kernel issue.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816737] [drm] The gpu crash dump is required to analyze gpu hangs, so please always attach it.
<0.6> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816738] [drm] GPU crash dump saved to /sys/class/drm/card0/error
<0.5> 2018-01-27 21:16:48 minas-tirith kernel - - - [13685.816792] drm/i915: Resetting chip after gpu hang
<0.5> 2018-01-27 21:17:00 minas-tirith kernel - - - [13697.816112] drm/i915: Resetting chip after gpu hang

I will attach gpu.2.log, the messages log since the machine upgrade, and the output of hwinfo --cpu and --gfxcard.

My desktop is XFCE and I have 4 GiB of RAM.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
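
For anyone triaging a long messages log like the one attached here, the hang events can be pulled out mechanically. The following is only a minimal sketch, assuming the kernel keeps the exact "GPU HANG" wording shown in the crash log above; the names HANG_RE and find_gpu_hangs are hypothetical and not part of any existing tool.

```python
import re

# Matches the i915 "GPU HANG" kernel message in the form quoted in this report.
# Assumption: the wording is stable; other kernel versions may phrase it differently.
HANG_RE = re.compile(
    r"\[\s*(?P<uptime>\d+\.\d+)\]\s+\[drm\] GPU HANG: ecode (?P<ecode>\S+), "
    r"in (?P<process>\S+) \[(?P<pid>\d+)\], reason: (?P<reason>[^,]+), "
    r"action: (?P<action>\w+)"
)


def find_gpu_hangs(lines):
    """Return one dict per GPU HANG message found in an iterable of log lines."""
    hangs = []
    for line in lines:
        match = HANG_RE.search(line)
        if match:
            hang = match.groupdict()
            hang["uptime"] = float(hang["uptime"])  # seconds since boot
            hang["pid"] = int(hang["pid"])
            hangs.append(hang)
    return hangs


if __name__ == "__main__":
    sample = [
        "<0.6> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808879] "
        "[drm] GPU HANG: ecode 4:0:0xfdefffff, in X [2154], "
        "reason: Hang on render ring, action: reset",
        "<0.5> 2018-01-27 12:47:17 minas-tirith kernel - - - [ 1128.808914] "
        "drm/i915: Resetting chip after gpu hang",
    ]
    for hang in find_gpu_hangs(sample):
        print(hang["uptime"], hang["ecode"], hang["process"], hang["pid"])
```

Only the first line of each hang is matched; the follow-up "Resetting chip after gpu hang" lines are deliberately ignored so that repeated resets are not counted as separate hangs.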

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c1

--- Comment #1 from Carlos Robinson <carlos.e.r@opensuse.org> ---
Created attachment 757825
  --> http://bugzilla.opensuse.org/attachment.cgi?id=757825&action=edit
CER: Messages log

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c2

--- Comment #2 from Carlos Robinson <carlos.e.r@opensuse.org> ---
Created attachment 757826
  --> http://bugzilla.opensuse.org/attachment.cgi?id=757826&action=edit
CER: gpu log

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c3

--- Comment #3 from Carlos Robinson <carlos.e.r@opensuse.org> ---
Created attachment 757827
  --> http://bugzilla.opensuse.org/attachment.cgi?id=757827&action=edit
CER: hwinfo output

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c4

--- Comment #4 from Carlos Robinson <carlos.e.r@opensuse.org> ---
At Felix Miata's suggestion, I am adding the inxi output:

minas-tirith:/home/cer/Bugzilla/Bug_1050256 - GPU hang # inxi -c0 -G
Graphics:  Card: Intel Mobile 4 Series Integrated Graphics Controller
           Display Server: X.org 1.18.3 drivers: intel (unloaded: modesetting,fbdev,vesa)
           tty size: 150x51 Advanced Data: N/A for root
minas-tirith:/home/cer/Bugzilla/Bug_1050256 - GPU hang #

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c5

--- Comment #5 from Carlos Robinson <carlos.e.r@opensuse.org> ---
At Stefan Dirsch's suggestion I have uninstalled drm-kmp-default; I will see what happens.

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c6

Carlos Robinson <carlos.e.r@opensuse.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
              Flags|needinfo?(carlos.e.r@opensu |
                   |se.org)                     |

--- Comment #6 from Carlos Robinson <carlos.e.r@opensuse.org> ---
I see a needinfo from me, but I don't see the question. :-?

Clearing.

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c8

--- Comment #8 from Carlos Robinson <carlos.e.r@opensuse.org> ---
Ah, ok :-)

So far, no crashes (I left the machine running all night while I slept), and the display artefacts have disappeared.

I will now hibernate and restore the machine; this usually causes some stress.

[...]

Restored fine, it seems. I can try rebooting with reduced memory.

[...]

Ok, did so: booted with 1G, opened thunderbird and firefox (the machine was swapping about another gig), alt-tabbed, switched workspaces, and saw no artifacts and no crashes.

So this machine should run without drm-kmp-default always? Or a patch is needed?

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c9

--- Comment #9 from Stefan Dirsch <sndirsch@suse.com> ---
(In reply to Carlos Robinson from comment #8)
> So this machine should run without drm-kmp-default always? Or a patch is needed?

Yes, that's probably best. In addition you can try KOTD (Kernel of the Day) to see if the issue has been fixed upstream in the meantime.

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c10

--- Comment #10 from Carlos Robinson <carlos.e.r@opensuse.org> ---
Well, I'll see if I can. Means also installing corresponding drm-kmp- too, I guess.

I also have to try installing Leap 15.0 in a test partition and report.

Thanks.

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c11

--- Comment #11 from Stefan Dirsch <sndirsch@suse.com> ---
(In reply to Carlos Robinson from comment #10)
> Well, I'll see if I can. Means also installing corresponding drm-kmp- too, I guess.

Oh no. *Un*installing, please!

> I also have to try installing Leap 15.0 in a test partition and report.

That's also useful. Thanks!

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c12

--- Comment #12 from Carlos Robinson <carlos.e.r@opensuse.org> ---
I don't understand. The crash doesn't happen unless I install drm-kmp, so without it there will be no way to know when the kernel solves the issue.

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c13

--- Comment #13 from Stefan Dirsch <sndirsch@suse.com> ---
? drm-kmp means the DRM drivers from Kernel 4.9. I would like to know whether the newer Kernels 4.14/4.15 fix the issue again. We know that the DRM of Kernel 4.4 still worked.
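
The relationship described in comment #13 is also visible in the package version strings quoted in the original report. As a rough sketch, assuming the usual openSUSE KMP layout of `<module version>_k<kernel version>_<kernel release>-<package release>`, the two parts can be separated like this (the names KMP_RE and split_kmp_version are hypothetical):

```python
import re

# Assumed layout: <module version>_k<kernel version>_<kernel release>-<package release>
KMP_RE = re.compile(r"^(?P<module>[\d.]+)_k(?P<kernel>[\d.]+)_(?P<release>\d+)")


def split_kmp_version(version):
    """Split a KMP version string into (module version, target kernel)."""
    match = KMP_RE.match(version)
    if match is None:
        raise ValueError(f"not a recognised KMP version string: {version!r}")
    kernel = f"{match.group('kernel')}-{match.group('release')}"
    return match.group("module"), kernel


if __name__ == "__main__":
    # Version string of the drm-kmp-default package named in the report.
    print(split_kmp_version("4.9.33_k4.4.104_39-7.24"))
    # prints ('4.9.33', '4.4.104-39')
```

Read this way, drm-kmp-default 4.9.33_k4.4.104_39-7.24 carries DRM code at version 4.9.33 built for kernel 4.4.104-39, which is the kernel the reporter was running.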

http://bugzilla.opensuse.org/show_bug.cgi?id=1077885
http://bugzilla.opensuse.org/show_bug.cgi?id=1077885#c14

Pawel Dziekonski <pawel.dziekonski@wcss.pl> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |pawel.dziekonski@wcss.pl

--- Comment #14 from Pawel Dziekonski <pawel.dziekonski@wcss.pl> ---
I have exactly the same problem after updating to 42.3.

Device Name: "Onboard IGD"
Model: "Intel Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller"
Vendor: pci 0x8086 "Intel Corporation"
Device: pci 0x0412 "Xeon E3-1200 v3/4th Gen Core Processor Integrated Graphics Controller"
SubVendor: pci 0x1462 "Micro-Star International Co., Ltd. [MSI]"
SubDevice: pci 0x7817
Revision: 0x06
Driver: "i915"
Driver Modules: "drm"

CPU: Intel(R) Core(TM) i5-4460 CPU @ 3.20GHz

uname -r
4.4.104-39-default

The only way to overcome this is to zypper addlock drm-kmp-default :(