[Bug 965125] New: kernel crashes with divide error: 0000 in e1000e driver

4 Feb 2016

      http://bugzilla.suse.com/show_bug.cgi?id=965125

            Bug ID: 965125
           Summary: kernel crashes with divide error: 0000 in e1000e
                    driver
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 42.1
          Hardware: x86-64
                OS: openSUSE 42.1
            Status: NEW
          Severity: Critical
          Priority: P5 - None
         Component: Kernel
          Assignee: kernel-maintainers@forge.provo.novell.com
          Reporter: mt@suse.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Unfortunately, I was unable to collect any crash dump with the 4.1.15 kernel,
as the machine freezes completely; I only got one with 4.1.12. I'll try to wait
much longer than 30 min; maybe this helps:

[14410.374151] divide error: 0000 [#1] PREEMPT SMP 
[14410.374158] Modules linked in: fuse af_packet iscsi_ibft iscsi_boot_sysfs
pl2303 usbserial snd_hda_codec_hdmi dm_mod hid_generic snd_usb_audio
snd_usbmidi_lib snd_rawmidi usbhid snd_seq_device nls_iso8859_1 nls_cp437
snd_hda_codec_realtek vfat snd_hda_codec_generic fat x86_pkg_temp_thermal
intel_powerclamp coretemp snd_hda_intel kvm_intel snd_hda_controller
snd_hda_codec snd_hda_core snd_hwdep kvm crct10dif_pclmul snd_pcm crc32_pclmul
crc32c_intel dcdbas aesni_intel aes_x86_64 snd_timer lrw gf128mul glue_helper
ablk_helper cryptd ppdev serio_raw iTCO_wdt iTCO_vendor_support pcspkr
parport_pc i2c_i801 parport 8250_fintek tpm_tis tpm snd soundcore e1000e ptp
lpc_ich mfd_core mei_me pps_core mei shpchp processor efivarfs raid1 md_mod
nvidia_uvm(PO) nvidia(PO) sr_mod cdrom ehci_pci ehci_hcd usbcore usb_common
[14410.374215]  drm video button sg
[14410.374221] CPU: 6 PID: 17115 Comm: kworker/6:1 Tainted: P           O   
4.1.12-1-default #1
[14410.374224] Hardware name: Dell Inc. OptiPlex 990/06D7TR, BIOS A18
09/24/2013
[14410.374238] Workqueue: events e1000e_systim_overflow_work [e1000e]
[14410.374241] task: ffff88007c4dc5d0 ti: ffff88003f6ec000 task.ti:
ffff88003f6ec000
[14410.374243] RIP: 0010:[<ffffffffa0bea997>]  [<ffffffffa0bea997>]
e1000e_cyclecounter_read+0xa7/0xc0 [e1000e]
[14410.374254] RSP: 0018:ffff88003f6efdb0  EFLAGS: 00010046
[14410.374256] RAX: 0000000000000000 RBX: ffff88003f0377e0 RCX:
0000000000000000
[14410.374258] RDX: 0000000000000000 RSI: 0000000000000000 RDI:
ffff88003f0377c8
[14410.374260] RBP: ffff88003f037810 R08: 0000000000000032 R09:
0000000000000000
[14410.374262] R10: 00000007ffffffff R11: 0000000000000293 R12:
0000000000000292
[14410.374264] R13: ffff88003f6efdf8 R14: 0000000000000000 R15:
0000000000000180
[14410.374266] FS:  0000000000000000(0000) GS:ffff88021ed80000(0000)
knlGS:0000000000000000
[14410.374268] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[14410.374270] CR2: 0000000002479018 CR3: 0000000001e10000 CR4:
00000000000406e0
[14410.374272] Stack:
[14410.374274]  ffffffff810de041 ffff88003f0377c0 ffffffffa0bf6ccd
0000000000000000
[14410.374277]  ffff88003f037710 ffff8801ab4477c0 ffff88021ed95f00
ffffe8ffffb81d00
[14410.374281]  ffffffffa0bf6e1d ffffe8ffffb81d00 0000000000000000
ffff88003f037710
[14410.374284] Call Trace:
[14410.374299]  [<ffffffff810de041>] timecounter_read+0x11/0x50
[14410.374310]  [<ffffffffa0bf6ccd>] e1000e_phc_gettime+0x2d/0x60 [e1000e]
[14410.374320]  [<ffffffffa0bf6e1d>] e1000e_systim_overflow_work+0x1d/0x80
[e1000e]
[14410.374326]  [<ffffffff81080af2>] process_one_work+0x142/0x420
[14410.374330]  [<ffffffff81080ee4>] worker_thread+0x114/0x470
[14410.374336]  [<ffffffff81086581>] kthread+0xc1/0xe0
[14410.374342]  [<ffffffff8165f462>] ret_from_fork+0x42/0x70
[14410.374346] Code: f8 d5 ff ff 8b 80 00 b6 00 00 48 8b 97 f8 d5 ff ff 8b 8a
04 b6 00 00 48 c1 e1 20 89 c0 31 d2 48 09 c1 48 89 c8 48 29 f0 48 89 c6 <49> f7
f1 48 85 d2 75 c1 4c 39 d6 77 bc eb 80 66 2e 0f 1f 84 00 
[14410.374382] RIP  [<ffffffffa0bea997>] e1000e_cyclecounter_read+0xa7/0xc0
[e1000e]
[14410.374390]  RSP <ffff88003f6efdb0>
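
The faulting instruction can be read straight off the "Code:" line: the byte in
angle brackets is where RIP points. Below is a minimal sketch of decoding it by
hand, assuming the standard x86-64 encoding of REX-prefixed F7 group-3
instructions. The helper name and lookup tables are made up for illustration
and only handle this one register-operand form.

```python
# Decode the faulting bytes "<49> f7 f1" from the Code: line of the oops.
FAULT_BYTES = bytes.fromhex("49f7f1")

# Register values copied from the oops (only the ones needed here).
REGS = {"rax": 0x0, "rdx": 0x0, "r9": 0x0}

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]
# Opcode F7 selects its operation through the ModRM "reg" field (/0../7).
GROUP3 = ["test", "test", "not", "neg", "mul", "imul", "div", "idiv"]


def decode_group3(code):
    """Decode a 64-bit 'REX.W F7 /r' instruction with a register operand."""
    rex, opcode, modrm = code[0], code[1], code[2]
    if rex & 0xF0 != 0x40 or not rex & 0x08 or opcode != 0xF7:
        raise ValueError("not a REX.W + F7 group-3 instruction")
    if modrm >> 6 != 0b11:
        raise ValueError("memory operand, not handled in this sketch")
    op = GROUP3[(modrm >> 3) & 0b111]
    # REX.B (bit 0 of the prefix) supplies the high bit of the r/m register.
    reg = REG64[(modrm & 0b111) | ((rex & 0b1) << 3)]
    return op, reg


op, reg = decode_group3(FAULT_BYTES)
print(f"{op} %{reg}   ; {reg} = {REGS[reg]:#x}")
# prints: div %r9   ; r9 = 0x0
```

So the trap is a 64-bit unsigned "div" by R9, and R9 is zero in the register
dump. RDX is zero as well, so a quotient overflow is ruled out and this looks
like a plain division by zero in e1000e_cyclecounter_read.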

Interestingly, the machine was running very stably until last week, when the
crashes started to happen for an unknown reason... (I suspected the nvidia
driver, but that does not seem to be the cause...).

Also interesting is the regularity of 4 hours + 1 minute:
# ls -l
total 16
drwxr-xr-x 2 root root 4096 Feb  4 00:52 2016-02-04-00:52
drwxr-xr-x 2 root root 4096 Feb  4 04:52 2016-02-04-04:52
drwxr-xr-x 2 root root 4096 Feb  4 08:53 2016-02-04-08:53
drwxr-xr-x 2 root root 4096 Feb  4 12:54 2016-02-04-12:54
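
To put numbers on that regularity, here is a small sketch that computes the
gaps between the dump directories listed above and converts the oops timestamp
into uptime. It assumes the directory names are crash-dump timestamps with
minute resolution; the function name is made up for illustration.

```python
from datetime import datetime

# Crash dump directory names from the listing above.
DUMPS = ["2016-02-04-00:52", "2016-02-04-04:52",
         "2016-02-04-08:53", "2016-02-04-12:54"]

# Kernel timestamp (seconds since boot) of the "divide error" line in the oops.
OOPS_UPTIME = 14410.374151


def crash_intervals(names):
    """Return the gaps in minutes between consecutive dump directories."""
    times = [datetime.strptime(n, "%Y-%m-%d-%H:%M") for n in names]
    return [int((b - a).total_seconds() // 60)
            for a, b in zip(times, times[1:])]


print(crash_intervals(DUMPS))
# prints: [240, 241, 241]

hours, rest = divmod(int(OOPS_UPTIME), 3600)
print(f"uptime at oops: {hours}h {rest // 60}m {rest % 60}s")
# prints: uptime at oops: 4h 0m 10s
```

The oops above fired about 10 seconds after 4 hours of uptime, which fits the
spacing of the dumps if the remaining minute or so goes into writing the dump
and rebooting.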

Whenever I look into the journal with "journalctl -b -X" after a
crash, I see just some cron run logs; nothing else seems to happen:

Feb 04 04:30:01 xanthos cron[2964]: pam_unix(crond:session): session opened for
user root by (uid=0)
Feb 04 04:30:01 xanthos systemd[2965]: pam_unix(systemd-user:session): session
opened for user root by (uid=0)
Feb 04 04:30:01 xanthos CRON[2964]: pam_unix(crond:session): session closed for
user root
Feb 04 04:30:01 xanthos systemd[2966]: pam_unix(systemd-user:session): session
closed for user root
Feb 04 04:45:01 xanthos cron[3005]: pam_unix(crond:session): session opened for
user root by (uid=0)
Feb 04 04:45:01 xanthos systemd[3006]: pam_unix(systemd-user:session): session
opened for user root by (uid=0)
Feb 04 04:45:01 xanthos CRON[3005]: pam_unix(crond:session): session closed for
user root
Feb 04 04:45:01 xanthos systemd[3007]: pam_unix(systemd-user:session): session
closed for user root

except when I'm doing something:

Feb 04 12:27:05 xanthos unix2_chkpwd[17122]: gkr-pam: unlocked login keyring
Feb 04 12:30:01 xanthos cron[17129]: pam_unix(crond:session): session opened
for user root by (uid=0)
Feb 04 12:30:01 xanthos systemd[17130]: pam_unix(systemd-user:session): session
opened for user root by (uid=0)
Feb 04 12:30:01 xanthos CRON[17129]: pam_unix(crond:session): session closed
for user root
Feb 04 12:30:01 xanthos systemd[17131]: pam_unix(systemd-user:session): session
closed for user root
Feb 04 12:45:01 xanthos cron[17236]: pam_unix(crond:session): session opened
for user root by (uid=0)
Feb 04 12:45:01 xanthos systemd[17237]: pam_unix(systemd-user:session): session
opened for user root by (uid=0)
Feb 04 12:45:01 xanthos CRON[17236]: pam_unix(crond:session): session closed
for user root
Feb 04 12:45:01 xanthos systemd[17238]: pam_unix(systemd-user:session): session
closed for user root
Feb 04 12:49:18 xanthos su[17290]: The gnome keyring socket is not owned with
the same credentials as the user login: /run/user/1050/keyring/control
Feb 04 12:49:18 xanthos su[17290]: gkr-pam: couldn't unlock the login keyring.
Feb 04 12:49:18 xanthos su[17290]: (to root) mt on pts/1
Feb 04 12:49:18 xanthos su[17290]: pam_unix(su-l:session): session opened for
user root by mt(uid=1050)
-- Reboot --
