Bug ID:         1179632
Summary:        Oops in handle_cmd_completion+0x7e9/0x1120
Classification: openSUSE
Product:        openSUSE Distribution
Version:        Leap 15.2
Hardware:       Other
OS:             Other
Status:         NEW
Severity:       Normal
Priority:       P5 - None
Component:      Kernel
Assignee:       kernel-bugs@opensuse.org
Reporter:       jeffm@suse.com
QA Contact:     qa-bugs@suse.de
Found By:       ---
Blocker:        ---

Created attachment 844147
dmesg.txt

I have a UAS device that disconnects occasionally, causing a mess of I/O
failures, SCSI disconnects, and USB resets.  Eventually, it oopses:

BUG: kernel NULL pointer dereference, address: 0000000000000008
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD db5b067 P4D db5b067 PUD db59067 PMD 0
Oops: 0000 [#1] SMP NOPTI
CPU: 7 PID: 16121 Comm: duplicacy Kdump: loaded Not tainted 5.3.18-lp152.50-default #1 openSUSE Leap 15.2
Hardware name: System manufacturer System Product Name/PRIME Z370-A II, BIOS 1202 08/15/2019
RIP: 0010:handle_cmd_completion+0x7e9/0x1120 [xhci_hcd]
Code: 0f 85 ec 06 00 00 45 0f b7 6c 24 0e 8b 04 24 48 89 ef 41 83 e5 1f 41 8d 55 ff 4c 8d 6c c5 00 49 8b 85 98 01 00 00 89 54 24 10 <48> 8b 70 08 e8 de 82 ff ff 48 89 c1 0f 1f 44 00 00 89 d9 48 c7 c2
RSP: 0000:ffffb20280270e28 EFLAGS: 00010002
RAX: 0000000000000000 RBX: 000000000000000b RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000fffffffe0 RDI: ffff90ac0764e270
RBP: ffff90ac0764e270 R08: ffff90ab3f842a70 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff90ab3f840810
R13: ffff90ac0764e2c8 R14: ffff90ae07eee920 R15: ffff90ac0764e320
FS:  00007f2145068700(0000) GS:ffff90b34edc0000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000000000008 CR3: 0000000004c9e006 CR4: 00000000003606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
 <IRQ>
 ? try_to_wake_up+0x460/0x4d0
 xhci_irq+0x2c7/0x440 [xhci_hcd]
 ? run_timer_softirq+0x75/0x440
 __handle_irq_event_percpu+0x46/0x1a0
 handle_irq_event_percpu+0x30/0x80
 handle_irq_event+0x3c/0x60
 handle_edge_irq+0x9b/0x1e0
 handle_irq+0x1f/0x30
 do_IRQ+0x49/0xd0
 common_interrupt+0xf/0xf
 </IRQ>


The RIP resolves to drivers/usb/host/xhci-ring.c:1157

1148 static void xhci_handle_cmd_reset_ep(struct xhci_hcd *xhci, int slot_id,
1149                 union xhci_trb *trb, u32 cmd_comp_code)
1150 {
1151         struct xhci_virt_device *vdev;
1152         struct xhci_ep_ctx *ep_ctx;
1153         unsigned int ep_index;
1154
1155         ep_index = TRB_TO_EP_INDEX(le32_to_cpu(trb->generic.field[3]));
1156         vdev = xhci->devs[slot_id];
1157         ep_ctx = xhci_get_ep_ctx(xhci, vdev->out_ctx, ep_index);
1158         trace_xhci_handle_cmd_reset_ep(ep_ctx);
1159
1160         /* This command will only fail if the endpoint wasn't halted,
1161          * but we don't care.
1162          */
1163         xhci_dbg_trace(xhci, trace_xhci_dbg_reset_ep,
1164                 "Ignoring reset ep completion code of %u", cmd_comp_code);
1165

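I expect that xhci->devs[slot_id] is NULL by the time this completion is
handled, presumably because xhci_free_virt_device cleared it when the
device disappeared.  That would be consistent with the oops: RAX is 0 and
the faulting address (CR2) is 0x8, i.e. a load at offset 8 from a NULL
pointer, which is what reading vdev->out_ctx on line 1157 would look like
if vdev were NULL.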
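
To illustrate the suspected failure mode, here is a minimal user-space
sketch of the pattern: a completion handler that looks up a per-slot
device pointer and bails out if the slot has already been torn down.
It is not the driver code and not a proposed patch.  The struct names,
MAX_SLOTS, and handle_reset_ep_completion() are hypothetical stand-ins
for xhci_hcd, xhci_virt_device, and xhci_handle_cmd_reset_ep, and the
return values are an assumption made only so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 8 /* hypothetical size, chosen only for this sketch */

/* Simplified stand-ins for the driver's structures (hypothetical). */
struct container_ctx { int id; };

struct virt_device {
	void *udev;
	struct container_ctx *out_ctx;
};

struct host {
	struct virt_device *devs[MAX_SLOTS];
};

/*
 * Model of a reset-endpoint completion handler.
 * Returns 0 if the completion was handled, -1 if the slot is out of
 * range or its device was already freed (devs[slot_id] == NULL).
 */
static int handle_reset_ep_completion(struct host *xhci, unsigned int slot_id)
{
	struct virt_device *vdev;

	assert(xhci != NULL);

	if (slot_id >= MAX_SLOTS)
		return -1;

	vdev = xhci->devs[slot_id];
	if (!vdev)
		return -1; /* device went away before the completion arrived */

	/* Only now is it safe to dereference vdev->out_ctx. */
	return vdev->out_ctx ? 0 : -1;
}
```

Without the NULL check, the vdev->out_ctx load in this model would fault
the same way line 1157 appears to in the trace above.  Whether an early
return is the right fix for the real driver, or whether the teardown
ordering needs to change instead, is a separate question.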

