[Bug 960848] New: i915 driver crashes in 4.3.3
http://bugzilla.opensuse.org/show_bug.cgi?id=960848

Bug ID: 960848
Summary: i915 driver crashes in 4.3.3
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: 2015*
Hardware: x86-64
OS: Other
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel
Assignee: kernel-maintainers@forge.provo.novell.com
Reporter: danielm@ecoscentric.com
QA Contact: qa-bugs@suse.de
Found By: ---
Blocker: ---

(2nd try, as I had a hard lockup & reboot the last time!)

I've just done a major update, a fresh install of Tumbleweed (20160101) on a new SSD. Previously the machine ran 13.1 with typically months of uptime.

I'm getting hard lock-ups, followed by a spontaneous reboot. The machine is lightly loaded as I've just been trying out Plasma 5 before fully reconfiguring and remounting user data, ie two Konsoles & FF browsing bug reports.

Looking at the journal, I can see numerous errors from the i915 driver. This is the first after boot (28 mins earlier) and then there were similar affecting the other CPUs and finally a hard lockup of the screen and a spontaneous reboot. I can't find any other information in the logs

Jan 06 11:42:10 chunk kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Jan 06 11:42:10 chunk kernel: [drm:__gen6_gt_wait_for_thread_c0.isra.14 [i915]] *ERROR* GT thread status wait timed out
Jan 06 11:42:10 chunk kernel: [drm:fw_domains_get [i915]] *ERROR* render: timed out waiting for forcewake ack request.
Jan 06 11:42:10 chunk kernel: [drm:__gen6_gt_wait_for_thread_c0.isra.14 [i915]] *ERROR* GT thread status wait timed out
Jan 06 11:42:10 chunk kernel: ------------[ cut here ]------------
Jan 06 11:42:10 chunk kernel: WARNING: CPU: 1 PID: 2445 at ../drivers/gpu/drm/i915/intel_uncore.c:238 __gen6_gt_wait_for_fifo+0xa7/0xb0 [i915]()
Jan 06 11:42:10 chunk kernel: WARN_ON(loop < 0 && fifo <= GT_FIFO_NUM_RESERVED_ENTRIES)
Jan 06 11:42:10 chunk kernel: Modules linked in:
Jan 06 11:42:10 chunk kernel: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs ip6t_REJECT xt_tcpudp nf_conntrack_ipv6 nf_defrag_ipv6 ip6table_raw ipt_REJECT iptable_raw xt_CT iptable_filter ip6table_mangle nf_conntrack_netbios_ns nf_conntrack_broadcast nf_conntrack_ipv4 nf_defrag_ipv4 ip_tables xt_conntrack nf_conntrack ip6table_filter ip6_tables x_tables gspca_sonixj gspca_main videodev snd_hda_codec_hdmi nls_iso8859_1 x86_pkg_temp_thermal intel_powerclamp coretemp nls_cp437 vfat snd_hda_codec_realtek fat kvm_intel snd_hda_codec_generic kvm crct10dif_pclmul snd_hda_intel snd_hda_codec iTCO_wdt snd_hda_core iTCO_vendor_support snd_hwdep ppdev crc32_pclmul snd_pcm aesni_intel aes_x86_64 lrw gf128mul glue_helper osst ablk_helper pcspkr serio_raw joydev lpc_ich
Jan 06 11:42:10 chunk kernel: usblp cryptd i2c_i801 st mfd_core snd_timer snd nuvoton_cir rc_core 8250_fintek soundcore mei_me parport_pc mei tpm_tis shpchp parport tpm processor efivarfs hid_generic hid_logitech ff_memless usbhid btrfs xor raid1 md_mod raid6_pq crc32c_intel xhci_pci sr_mod cdrom floppy aic7xxx xhci_hcd r8169 scsi_transport_spi mii i915 video i2c_algo_bit drm_kms_helper syscopyarea sysfillrect sysimgblt ehci_pci fb_sys_fops ehci_hcd drm usbcore usb_common button fjes dm_mirror dm_region_hash dm_log dm_mod sg
Jan 06 11:42:10 chunk kernel: CPU: 1 PID: 2445 Comm: plasmashell Not tainted 4.3.3-3-default #1
Jan 06 11:42:10 chunk kernel: Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./H67M-GE/HT, BIOS P1.40 02/18/2011
Jan 06 11:42:10 chunk kernel: ffffffffa02855e0 ffff8800a7c0ba18 ffffffff81376259 ffff8800a7c0ba60
Jan 06 11:42:10 chunk kernel: ffff8800a7c0ba50 ffffffff8107afc2 0000000000000010 0000000000000000
Jan 06 11:42:10 chunk kernel: ffff88017e350000 0000000000000246 000000000001be30 ffff8800a7c0bab0
Jan 06 11:42:10 chunk kernel: Call Trace:
Jan 06 11:42:10 chunk kernel: [<ffffffff8101a385>] try_stack_unwind+0x175/0x190
Jan 06 11:42:10 chunk kernel: [<ffffffff810191d9>] dump_trace+0x69/0x3a0
Jan 06 11:42:10 chunk kernel: [<ffffffff8101a3eb>] show_trace_log_lvl+0x4b/0x60
Jan 06 11:42:10 chunk kernel: [<ffffffff8101961c>] show_stack_log_lvl+0x10c/0x180
Jan 06 11:42:10 chunk kernel: [<ffffffff8101a485>] show_stack+0x25/0x50
Jan 06 11:42:10 chunk kernel: [<ffffffff81376259>] dump_stack+0x4b/0x72
Jan 06 11:42:10 chunk kernel: [<ffffffff8107afc2>] warn_slowpath_common+0x82/0xc0
Jan 06 11:42:10 chunk kernel: [<ffffffff8107b04c>] warn_slowpath_fmt+0x4c/0x50
Jan 06 11:42:10 chunk kernel: [<ffffffffa0205c87>] __gen6_gt_wait_for_fifo+0xa7/0xb0 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa0207c34>] gen6_write32+0xd4/0xf0 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01ff7d7>] ring_write_tail+0x27/0x30 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01ff7aa>] __intel_ring_advance+0x3a/0x40 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa0203f06>] gen6_add_request+0xb6/0xd0 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01ebaae>] __i915_add_request+0x8e/0x200 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01e2102>] i915_gem_ringbuffer_submission+0x8c2/0xa90 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01e0f67>] i915_gem_do_execbuffer.isra.26+0xc57/0x11d0 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa01e2672>] i915_gem_execbuffer2+0xb2/0x240 [i915]
Jan 06 11:42:10 chunk kernel: [<ffffffffa00b9458>] drm_ioctl+0x138/0x500 [drm]
Jan 06 11:42:10 chunk kernel: [<ffffffff81207e25>] do_vfs_ioctl+0x285/0x460
Jan 06 11:42:10 chunk kernel: [<ffffffff81208079>] SyS_ioctl+0x79/0x90
Jan 06 11:42:10 chunk kernel: [<ffffffff81694976>] entry_SYSCALL_64_fastpath+0x16/0x75
Jan 06 11:42:10 chunk kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x75
Jan 06 11:42:10 chunk kernel:
Jan 06 11:42:10 chunk kernel: Leftover inexact backtrace:
Jan 06 11:42:10 chunk kernel: ---[ end trace 02b9713ab08a8b63 ]---
Jan 06 11:42:10 chunk kernel: ------------[ cut here ]------------

There are no proprietary drivers, third party modules or code not from the installation DVD/stock update repos.

--
You are receiving this mail because:
You are on the CC list for the bug.
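For anyone wanting to check their own journal for the same symptom, here is a minimal sketch that counts i915 DRM error lines in kernel log text. The function name count_i915_errors is made up for illustration, and the pattern is only inferred from the "[drm:... [i915]] *ERROR*" lines quoted in the report above, so it may miss differently formatted messages.

```shell
#!/bin/sh
# Hypothetical helper: count i915 DRM error lines in kernel log text
# read from stdin. The pattern is inferred from the messages quoted in
# this report, e.g. "[drm:fw_domains_get [i915]] *ERROR* ...".
count_i915_errors() {
    # grep -c prints the number of matching lines; it exits non-zero
    # when the count is 0, so "|| true" keeps the helper from failing.
    grep -c -E '\[drm:[^]]*\[i915\]\] \*ERROR\*' || true
}
```

On a systemd machine it could be fed the previous boot's kernel messages with `journalctl -k -b -1 | count_i915_errors`, assuming the journal is persistent so that the earlier boot is still available.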
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c1
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c2
--- Comment #2 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c3
--- Comment #3 from Takashi Iwai
> Thanks for the advice. I'd already updated the kernel to 4.3.3-4-default this morning as it appeared in the TW updates overnight (and I see it matches Kernel:Stable), so I've now configured kdump as well. As is typical, no spontaneous reboots to report, yet...

Don't forget to check beforehand whether kdump really works. For example, try

    echo c > /proc/sysrq-trigger

The YaST kdump setup tends to reserve too little memory. If in doubt, give it more memory in the kdump setup.

> I also noticed that the system failed to boot, maybe 2/5 of the time. Sometimes it seemed to get stuck before the TW graphic, other times the animated infinity logo keeps flashing but the services don't seem to progress. One time there was a column of fine distorted pixels down the left hand edge of the screen. The only consistent thing I've noticed is an error on the console (and in the journal):
>
> [drm:intel_opregion_init [i915]] *ERROR* No ACPI video bus found

This doesn't seem relevant. The ACPI video control is for brightness, usually on laptops or AiO machines.

Anyway, it'd be helpful to have more details of your machine, e.g. the output of "hwinfo --all". Also, it'd be better to have the full kernel messages, including the Oops.
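On the kdump check suggested above: before triggering a test crash it may be worth confirming that a crash kernel is actually loaded and how much memory was reserved for it. The following is only a sketch. The helper names kdump_status and crashkernel_param are made up for illustration, while /sys/kernel/kexec_crash_loaded and /proc/cmdline are the standard kernel interfaces they read by default.

```shell
#!/bin/sh
# Hypothetical helper: print "loaded" if the flag file says a crash
# kernel is loaded, otherwise "not loaded". The path argument exists
# only so the helper can be tried against a test file.
kdump_status() {
    flag_file=${1:-/sys/kernel/kexec_crash_loaded}
    if [ -r "$flag_file" ] && [ "$(cat "$flag_file")" = "1" ]; then
        echo "loaded"
    else
        echo "not loaded"
    fi
}

# Hypothetical helper: print the value of crashkernel= from a kernel
# command line file, or nothing if the parameter is absent.
crashkernel_param() {
    cmdline_file=${1:-/proc/cmdline}
    # One parameter per line, then keep only the crashkernel= value.
    tr ' ' '\n' < "$cmdline_file" | sed -n 's/^crashkernel=//p'
}
```

Called without arguments, both helpers read the live system. Only if kdump_status prints "loaded" is the sysrq test crash likely to produce a dump, and since that command takes the machine down immediately, it is best run with all work saved.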
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c4
--- Comment #4 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c5
--- Comment #5 from Daniel Morris
> Don't forget to check beforehand whether kdump really works. For example, try
>
>     echo c > /proc/sysrq-trigger
>
> The YaST kdump setup tends to reserve too little memory. If in doubt, give it more memory in the kdump setup.

Thanks for the warning! YaST2 had only assigned 116M, which was indeed too small (16GiB installed). 256M finally worked. I've been around the loop quite a few times. According to https://activedoc.opensuse.org/book/opensuse-system-analysis-and-tuning-guid..., crashkernel=512M should have been correct, but that stopped the machine from booting (somewhere around the USB/SCSI probing as btrfs was mounting on the SSD; it would black-screen and lock hard, and the only way to look at the trail was to video it on my mobile). I also tried complying with the warnings asking for crashkernel=Y@X by appending @16M for the offset, but it whined that it couldn't reserve the memory for the kdump service.

Fig 18.1 (YaST2 Kdump) ought to be updated, as it complicates things with a throwback to the early nineties with low & high memory. Maybe I've just had an easy life since then... :)

> [drm:intel_opregion_init [i915]] *ERROR* No ACPI video bus found
>
> This doesn't seem relevant. The ACPI video control is for brightness, usually on laptops or AiO machines.

It's a simple desktop, bought to avoid having any dodgy graphics drivers after the never-ending cycle of ATI pain before. Plasma is funny though, showing an empty battery gauge at the login screen, later refreshing to full.

> Anyway, it'd be helpful to have more details of your machine, e.g. the output of "hwinfo --all".

Please find attached.
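To make the two crashkernel= forms tried above easier to compare, here is a small sketch that splits a value of the simple form size[@offset] into its parts. The function name crashkernel_parts is hypothetical, and the sketch deliberately ignores the other forms the kernel accepts (such as memory ranges), so treat it as an illustration only.

```shell
#!/bin/sh
# Hypothetical helper: split a crashkernel= value of the simple form
# size[KMG][@offset[KMG]] into size and offset. When no offset is
# given, "auto" is printed to indicate the kernel chooses the location.
crashkernel_parts() {
    value=$1
    size=${value%%@*}          # everything before the first "@"
    case $value in
        *@*) offset=${value#*@} ;;   # everything after the first "@"
        *)   offset=auto ;;
    esac
    printf 'size=%s offset=%s\n' "$size" "$offset"
}
```

With the values from this comment, `crashkernel_parts 256M` reports an automatic offset, while `crashkernel_parts 512M@16M` reports the fixed 16M offset that could not be reserved on this machine.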
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c6
Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c7
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c8
--- Comment #8 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c9
--- Comment #9 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c10
Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c11
--- Comment #11 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c12
--- Comment #12 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c13
--- Comment #13 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c14
--- Comment #14 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c15
--- Comment #15 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c16
--- Comment #16 from Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c17
Daniel Morris
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c18
--- Comment #18 from Takashi Iwai
http://bugzilla.opensuse.org/show_bug.cgi?id=960848#c20
Daniel Morris