Bug ID | 1087635 |
---|---|
Summary | r8152 livelocks during pm_runtime_suspend |
Classification | openSUSE |
Product | openSUSE Tumbleweed |
Version | Current |
Hardware | Other |
OS | Other |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | jslaby@suse.com |
QA Contact | qa-bugs@suse.de |
CC | tbogendoerfer@suse.com |
Found By | --- |
Blocker | --- |
Created attachment 765580 [details]
full dmesg after sysrq-t

I reported it to upstream too:
https://marc.info/?l=linux-pm&m=152241584411483&w=2

The kernel is 4.15.14-15.gdef7e44-default.

The r8152 NIC in my docking station has killed my box several times in the last few days. The notebook is new, so I don't know if this is a regression. I have the NIC connected all the time, and when I return to the notebook after a while, the networking is dead.

Looking at the stack traces, it is clear that an autosuspend of r8152 was attempted and that it is waiting in napi_disable for the NAPI_STATE_SCHED bit to be cleared:

[22001.018437] kworker/2:0 D 0 16267 2 0x80000000
[22001.018441] Workqueue: pm pm_runtime_work
[22001.018443] Call Trace:
[22001.018453] schedule+0x2f/0x90
[22001.018455] schedule_timeout+0x1ce/0x540
[22001.018474] msleep+0x29/0x30
[22001.018477] napi_disable+0x25/0x60
[22001.018483] rtl8152_suspend+0x20a/0x2d0 [r8152]
[22001.018493] usb_suspend_both+0x8d/0x200 [usbcore]
[22001.018510] usb_runtime_suspend+0x2a/0x70 [usbcore]
[22001.018514] __rpm_callback+0xbc/0x1f0
[22001.018519] rpm_callback+0x4f/0x70
[22001.018526] rpm_suspend+0x11d/0x6d0
[22001.018532] pm_runtime_work+0x73/0xb0
[22001.018535] process_one_work+0x269/0x6c0
[22001.018541] worker_thread+0x2b/0x3d0
[22001.018547] kthread+0x113/0x130
[22001.018556] ret_from_fork+0x24/0x50

The assembly:

> ffffffff81716730 <napi_disable>:
> ffffffff81716730: e8 eb b7 2e 00 callq ffffffff81a01f20 <__fentry__>
> ffffffff81716735: 55 push %rbp
> ffffffff81716736: 48 89 fd mov %rdi,%rbp
> ffffffff81716739: 53 push %rbx
> ffffffff8171673a: 48 8d 5f 10 lea 0x10(%rdi),%rbx
> ffffffff8171673e: f0 80 4f 10 04 lock orb $0x4,0x10(%rdi)
> ffffffff81716743: f0 0f ba 6f 10 00 lock btsl $0x0,0x10(%rdi)
> ffffffff81716749: 73 11 jae ffffffff8171675c <napi_disable+0x2c>
> ffffffff8171674b: bf 01 00 00 00 mov $0x1,%edi
> ffffffff81716750: e8 ab ac a0 ff callq ffffffff81121400 <msleep>
> ffffffff81716755: f0 0f ba 2b 00 lock btsl $0x0,(%rbx)
> ffffffff8171675a: 72 ef jb ffffffff8171674b <napi_disable+0x1b>

There are other tasks in D state, of course, like these, which are waiting for the device to be runtime-resumed:

[22001.018749] kworker/3:1 D 0 16798 2 0x80000000
[22001.018753] Workqueue: events rtl_work_func_t [r8152]
[22001.018755] Call Trace:
[22001.018767] schedule+0x2f/0x90
[22001.018769] rpm_resume+0xf9/0x860
[22001.018777] rpm_resume+0x592/0x860
[22001.018783] __pm_runtime_resume+0x3a/0x50
[22001.018789] usb_autopm_get_interface+0x1d/0x50 [usbcore]
[22001.018793] rtl_work_func_t+0x3e/0x405 [r8152]
[22001.018801] process_one_work+0x269/0x6c0
[22001.018807] worker_thread+0x2b/0x3d0
[22001.018813] kthread+0x113/0x130
[22001.018822] ret_from_fork+0x24/0x50

[22001.019713] tcpdump D 0 17119 4265 0x00000004
[22001.019716] Call Trace:
[22001.019728] schedule+0x2f/0x90
[22001.019730] rpm_resume+0xf9/0x860
[22001.019738] rpm_resume+0x592/0x860
[22001.019744] __pm_runtime_resume+0x3a/0x50
[22001.019750] usb_autopm_get_interface+0x1d/0x50 [usbcore]
[22001.019754] rtl8152_ioctl+0x30/0x140 [r8152]
[22001.019758] dev_ifsioc+0x115/0x3f0
[22001.019763] dev_ioctl+0x14b/0x680
[22001.019775] sock_do_ioctl+0x41/0x50
[22001.019778] sock_ioctl+0x1c2/0x2f0
[22001.019781] do_vfs_ioctl+0x91/0x680
[22001.019789] SyS_ioctl+0x74/0x80
[22001.019794] do_syscall_64+0x76/0x1c0
...

> Showing all locks held in the system:
> 1 lock held by in:imklog/1371:
>  #0: (&f->f_pos_lock){+.+.}, at: [<00000000a0b38807>] __fdget_pos+0x3f/0x50
> 1 lock held by Qt bearer threa/3003:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by Qt bearer threa/2825:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by DNS Res~ver #40/17041:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by Qt bearer threa/3110:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 1 lock held by DNS Res~ver #16/17044:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000021e0bca0>] __netlink_dump_start+0x4c/0x1b0
> 2 locks held by bash/4561:
>  #0: (&tty->ldisc_sem){++++}, at: [<00000000e3d76e61>] tty_ldisc_ref_wait+0x24/0x50
>  #1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000091462d05>] n_tty_read+0xc3/0x850
> 3 locks held by kworker/2:0/16267:
>  #0: ((wq_completion)"pm"){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
>  #1: ((work_completion)(&dev->power.work)){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
>  #2: (&tp->control){+.+.}, at: [<00000000ca575b90>] rtl8152_suspend+0x2b/0x2d0 [r8152]
> 2 locks held by kworker/3:1/16798:
>  #0: ((wq_completion)"events"){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
>  #1: ((work_completion)(&(&tp->schedule)->work)){+.+.}, at: [<00000000b9dc0832>] process_one_work+0x1e3/0x6c0
> 1 lock held by tcpdump/17119:
>  #0: (rtnl_mutex){+.+.}, at: [<0000000023a6461d>] dev_ioctl+0x13d/0x680
> 2 locks held by less/17187:
>  #0: (&tty->ldisc_sem){++++}, at: [<00000000e3d76e61>] tty_ldisc_ref_wait+0x24/0x50
>  #1: (&ldata->atomic_read_lock){+.+.}, at: [<0000000091462d05>] n_tty_read+0xc3/0x850

For now, I disabled pm-runtime on the device by:

  echo on > /sys/bus/usb/devices/4-1.2/power/control

Any ideas what's wrong? napi_disable from runtime suspend? Double napi_disable on the path? Some missing pm_runtime_get_sync somewhere?