[Bug 1231599] New: hard freezes related to btusb device 04ca:3802
https://bugzilla.suse.com/show_bug.cgi?id=1231599

Bug ID: 1231599
Summary: hard freezes related to btusb device 04ca:3802
Classification: openSUSE
Product: openSUSE Tumbleweed
Version: Current
Hardware: x86-64
OS: openSUSE Tumbleweed
Status: NEW
Severity: Normal
Priority: P5 - None
Component: Kernel:Drivers
Assignee: kernel-bugs@suse.de
Reporter: best.scouring105@passinbox.com
QA Contact: qa-bugs@suse.de
Target Milestone: ---
Found By: ---
Blocker: ---

User reports hard system freezes which happen some time after resume from sleep.

Appears to be related to btusb and device 04ca:3802 Lite-On Technology Corp. Wireless_Device, a Mediatek USB Bluetooth module which is internally connected in this case.

Freezes began after upgrade to kernel 6.11.0; latest log is with 6.11.2-1.

Multiple logs show this sequence of events:
1. resume from sleep
2. usb 1-4 errors
3. `kernel: Oops: general protection fault` (hci_unregister_dev, btusb_disconnect)
4. `kernel: BUG: workqueue lockup` several times that match the oops
5. total freeze, presumably kernel panic

Forum thread with logs:
https://forums.opensuse.org/t/experencing-random-crashes-ever-since-20240927...

Log excerpts:

usb errors and kernel oops:
(resume at Oct 10 19:19:39)

Oct 10 19:19:54 kernel: usb 1-4: Failed to suspend device, error -110
Oct 10 19:20:06 kernel: usb 1-4: Failed to suspend device, error -110
Oct 10 19:38:46 kernel: usb 1-4: device descriptor read/64, error -110
Oct 10 19:38:46 kernel: usb 1-4: reset high-speed USB device number 4 using xhci_hcd
Oct 10 19:38:52 kernel: xhci_hcd 0000:05:00.3: Timeout while waiting for setup device command
Oct 10 19:38:57 kernel: xhci_hcd 0000:05:00.3: Timeout while waiting for setup device command
Oct 10 19:38:57 kernel: usb 1-4: device not accepting address 4, error -62
Oct 10 19:38:57 kernel: usb 1-4: reset high-speed USB device number 4 using xhci_hcd
Oct 10 19:39:02 kernel: xhci_hcd 0000:05:00.3: Timeout while waiting for setup device command
Oct 10 19:39:05 systemd-udevd[766]: 1-4: Worker [27541] processing SEQNUM=4265 is taking a long time
Oct 10 19:39:08 kernel: xhci_hcd 0000:05:00.3: Timeout while waiting for setup device command
Oct 10 19:39:08 fwupd[27462]: 18:39:08.494 FuUsbDevice failed to load BOS descriptor from USB device: USB error on device 04ca:3802 : Input/Output Error [-1]
Oct 10 19:39:08 systemd[1]: Starting Load/Save RF Kill Switch Status...
Oct 10 19:39:08 kernel: usb 1-4: device not accepting address 4, error -62
Oct 10 19:39:08 kernel: usb 1-4: USB disconnect, device number 4
Oct 10 19:39:08 kernel: Oops: general protection fault, probably for non-canonical address 0xdead000000000108: 0000 [#1] PREEMPT SMP NOPTI
Oct 10 19:39:08 kernel: CPU: 0 UID: 0 PID: 7129 Comm: kworker/0:2 Tainted: P W O 6.11.2-1-default #1 openSUSE Tumbleweed e7184aff5e8c765d07bd8e233cb429101cfc70a8
Oct 10 19:39:08 kernel: Tainted: [P]=PROPRIETARY_MODULE, [W]=WARN, [O]=OOT_MODULE
Oct 10 19:39:08 kernel: Hardware name: Acer Aspire A715-42G/Azalea_CAS, BIOS V1.08 09/15/2021
Oct 10 19:39:08 kernel: Workqueue: usb_hub_wq hub_event [usbcore]
Oct 10 19:39:08 kernel: RIP: 0010:hci_unregister_dev+0x4c/0x1e0 [bluetooth]
Oct 10 19:39:08 kernel: Code: f0 80 8b e9 0e 00 00 08 48 89 ef e8 4e e0 5f e5 48 c7 c7 e8 0c 1e c2 e8 02 43 60 e5 48 8b 13 48 8b 43 08 48 c7 c7 e8 0c 1e c2 <48> 89 42 08 48 89 10 48 b8 00 01 00 00 00 00 ad de 48 89 03 48 83
Oct 10 19:39:08 kernel: RSP: 0018:ffffaa60cdfefba0 EFLAGS: 00010246
Oct 10 19:39:08 kernel: RAX: dead000000000122 RBX: ffff9ae27b412000 RCX: 0000000000000000
Oct 10 19:39:08 kernel: RDX: dead000000000100 RSI: ffff9ae144152c50 RDI: ffffffffc21e0ce8
Oct 10 19:39:08 kernel: RBP: ffff9ae27b4124d0 R08: 0000000000000000 R09: ffff9ae1401cdb10
Oct 10 19:39:08 kernel: R10: ffffaa60cdfefba8 R11: ffffaa60cdfefbb0 R12: ffff9ae27b412000
Oct 10 19:39:08 kernel: R13: ffffffffc1db6278 R14: ffffffffc1db6278 R15: ffff9ae1df9b8050
Oct 10 19:39:08 kernel: FS: 0000000000000000(0000) GS:ffff9ae43e200000(0000) knlGS:0000000000000000
Oct 10 19:39:08 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Oct 10 19:39:08 kernel: CR2: 00007f630bb3d000 CR3: 0000000196e22000 CR4: 0000000000350ef0
Oct 10 19:39:08 kernel: Call Trace:
Oct 10 19:39:08 kernel: <TASK>
Oct 10 19:39:08 kernel: ? __die_body.cold+0x19/0x26
Oct 10 19:39:08 kernel: ? die_addr+0x3c/0x60
Oct 10 19:39:08 kernel: ? exc_general_protection+0x175/0x3f0
Oct 10 19:39:08 kernel: ? asm_exc_general_protection+0x26/0x30
Oct 10 19:39:08 kernel: ? hci_unregister_dev+0x4c/0x1e0 [bluetooth 340bac5c71bf02fd7707953532fb0e90f0dfe33a]
Oct 10 19:39:08 kernel: btusb_disconnect+0x67/0x170 [btusb 8d1c4c7e627b70f5c6fd39af1f0f23de98da9535]
Oct 10 19:39:08 kernel: usb_unbind_interface+0x93/0x290 [usbcore 2e714cc5ca1bc0f63406ccc6aa80d83ee64296ce]
Oct 10 19:39:08 kernel: device_release_driver_internal+0x19c/0x200
Oct 10 19:39:08 kernel: bus_remove_device+0xc6/0x130
Oct 10 19:39:08 kernel: device_del+0x161/0x3d0
Oct 10 19:39:08 kernel: ? srso_return_thunk+0x5/0x5f
Oct 10 19:39:08 kernel: ? kobject_put+0xa0/0x1d0
Oct 10 19:39:08 kernel: usb_disable_device+0x104/0x220 [usbcore 2e714cc5ca1bc0f63406ccc6aa80d83ee64296ce]
Oct 10 19:39:08 kernel: usb_disconnect+0xe6/0x2e0 [usbcore 2e714cc5ca1bc0f63406ccc6aa80d83ee64296ce]
Oct 10 19:39:08 kernel: hub_event+0xde6/0x1930 [usbcore 2e714cc5ca1bc0f63406ccc6aa80d83ee64296ce]
Oct 10 19:39:08 kernel: ? move_pfn_range_to_zone+0x191/0x1f0
Oct 10 19:39:08 kernel: process_one_work+0x16b/0x320
Oct 10 19:39:08 kernel: worker_thread+0x2ea/0x420
Oct 10 19:39:08 kernel: ? __pfx_worker_thread+0x10/0x10
Oct 10 19:39:08 kernel: kthread+0xd2/0x100
Oct 10 19:39:08 kernel: ? __pfx_kthread+0x10/0x10
Oct 10 19:39:08 kernel: ret_from_fork+0x34/0x50
Oct 10 19:39:08 kernel: ? __pfx_kthread+0x10/0x10
Oct 10 19:39:08 kernel: ret_from_fork_asm+0x1a/0x30
Oct 10 19:39:08 kernel: </TASK>
Oct 10 19:39:08 kernel: Modules linked in: udp_diag tcp_diag inet_diag snd_seq_dummy snd_hrtimer snd_seq snd_seq_device rfcomm wireguard libchacha20poly1305 chacha_x86_64 poly1305_x86_64 curve25519_x86_64 libcurve25519_generic libchacha ip6_udp_tunnel udp_tunnel af_packet nvidia_drm(PO) nvidia_modeset(PO) nvidia_uvm(PO) ccm algif_aead des3_ede_x86_64 nvidia(PO) des_generic libdes algif_skcipher cmac md4 algif_hash af_alg qrtr bnep nf_tables iptable_filter ext4 nls_iso8859_1 nls_cp437 mbcache vfat jbd2 fat snd_ctl_led snd_sof_amd_acp63 snd_sof_amd_vangogh snd_sof_amd_rembrandt snd_sof_amd_renoir snd_sof_amd_acp snd_sof_pci snd_sof_xtensa_dsp snd_sof snd_sof_utils snd_pci_ps snd_amd_sdw_acpi snd_hda_codec_realtek soundwire_amd soundwire_generic_allocation soundwire_bus snd_hda_codec_generic snd_hda_scodec_component mt7921e snd_hda_codec_hdmi intel_rapl_msr amd_atl mt7921_common snd_soc_core intel_rapl_common uvcvideo btusb mt792x_lib edac_mce_amd snd_compress snd_hda_intel btrtl videobuf2_vmalloc snd_pcm_dmaengine uvc
Oct 10 19:39:08 kernel: snd_intel_dspcfg mt76_connac_lib btintel snd_rpl_pci_acp6x snd_intel_sdw_acpi videobuf2_memops mt76 btbcm videobuf2_v4l2 snd_acp_pci snd_hda_codec kvm_amd btmtk mac80211 videodev snd_acp_legacy_common ee1004 bluetooth snd_hda_core acer_wmi r8169 libarc4 snd_pci_acp6x videobuf2_common snd_hwdep platform_profile mc kvm pcspkr snd_pci_acp5x sparse_keymap wmi_bmof snd_pcm realtek mdio_devres cfg80211 snd_rn_pci_acp3x snd_timer snd_acp_config i2c_piix4 k10temp snd snd_soc_acpi soundcore libphy i2c_smbus rfkill snd_pci_acp3x ac joydev acer_wireless tiny_power_button loop nvme_fabrics fuse efi_pstore dm_mod configfs nfnetlink dmi_sysfs ip_tables x_tables crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic amdgpu ahci libahci ghash_clmulni_intel amdxcp i2c_algo_bit sha512_ssse3 libata drm_ttm_helper ttm sha256_ssse3 drm_exec sha1_ssse3 gpu_sched drm_suballoc_helper xhci_pci sd_mod drm_buddy xhci_pci_renesas nvme scsi_dh_emc hid_multitouch drm_display_helper xhci_hcd scsi_dh_rdac hid_generic aesni_intel
Oct 10 19:39:08 kernel: nvme_core scsi_dh_alua cec sg gf128mul scsi_mod usbcore crypto_simd cryptd ccp scsi_common rc_core sp5100_tco nvme_auth video battery wmi i2c_hid_acpi i2c_hid button serio_raw btrfs blake2b_generic libcrc32c crc32c_intel xor raid6_pq pkcs8_key_parser msr i2c_dev efivarfs
Oct 10 19:39:08 kernel: ---[ end trace 0000000000000000 ]---

workqueue lockup:

Oct 10 19:39:59 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 50s!
Oct 10 19:40:30 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 81s!
Oct 10 19:41:00 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 112s!
Oct 10 19:41:31 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 143s!
Oct 10 19:42:02 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 173s!
Oct 10 19:42:33 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 204s!
Oct 10 19:43:03 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 235s!
Oct 10 19:43:34 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 266s!
Oct 10 19:44:05 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 296s!
Oct 10 19:44:35 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 327s!
Oct 10 19:45:06 kernel: BUG: workqueue lockup - pool cpus=0 node=0 flags=0x0 nice=0 stuck for 358s!

--
You are receiving this mail because:
You are on the CC list for the bug.

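The stages listed in the report (USB errors on port 1-4, the general protection fault, then the repeated workqueue lockups) can be pulled out of a saved journal with a small filter. This is only a sketch: the function name `btusb_freeze_markers` and the file name `journal.txt` are made up for illustration, the patterns are copied from the excerpt in this report, and the port `usb 1-4` is specific to the reporter's machine.

```shell
# Print only the lines that mark the stages of the failure described in
# this report. Reads a saved journal (plain text) from the given file(s),
# or from stdin when no file is given.
# Assumption: the affected device sits on "usb 1-4", as in the excerpt.
btusb_freeze_markers() {
    grep -E 'usb 1-4: (Failed to suspend device|device not accepting address|USB disconnect)|Oops: general protection fault|BUG: workqueue lockup' "$@"
}

# Hypothetical usage: save the previous boot's kernel messages, then filter.
#   journalctl -k -b -1 > journal.txt
#   btusb_freeze_markers journal.txt
```

An empty result does not rule out the bug; it only means none of these particular messages were logged before the freeze.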
https://bugzilla.suse.com/show_bug.cgi?id=1231599#c1

Ely G <elydgolden@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |elydgolden@gmail.com

--- Comment #1 from Ely G <elydgolden@gmail.com> ---
I am the user in question; I can confirm this behaviour in 20241007, but I cannot yet reproduce it manually.

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c3

--- Comment #3 from Ely G <elydgolden@gmail.com> ---
Yep:

Oct 14 11:54:42 Elys-Aspire-A715-42G kernel: usb 1-4: Failed to suspend device, error -110

I can also link you directly to all logs I've posted:
[1](https://pastebin.com/raw/KYHinsjb)
[2](https://www.dropbox.com/scl/fi/pbxvs20z5gnsjs07ki806/log.txt?rlkey=ftcxnkcmfadd0bp635xh0pp6b&st=3xlxfa8e&raw=1)
[3](https://www.dropbox.com/scl/fi/tcbihb2pqqejtsqr2lg8o/log2.txt?rlkey=a20o40qb0iwn6ycwmqal52sv1&st=0wkpzlm9&raw=1)
[4](https://www.dropbox.com/scl/fi/cc6dz40ighgjum2vr4op4/log3.txt?rlkey=ousz7ppbb23956z5gkostcua6&st=6h8ibl09&raw=1)
[5](https://www.dropbox.com/scl/fi/2k2qganhem2h5ax0k8vos/log4.txt?rlkey=wpumvazg4tuwss9k8jb0de55y&st=u5ynm94y&raw=1)

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c5

--- Comment #5 from Ely G <elydgolden@gmail.com> ---
Created attachment 877973
--> https://bugzilla.suse.com/attachment.cgi?id=877973&action=edit
Crash logs

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c7

--- Comment #7 from Ely G <elydgolden@gmail.com> ---
Sure; I just downloaded it. Should I run it for a few days without rebooting to see whether the freezes still occur?

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c9

--- Comment #9 from Ely G <elydgolden@gmail.com> ---
Created attachment 878099
--> https://bugzilla.suse.com/attachment.cgi?id=878099&action=edit
journalctl messages on unpatched kernel 1

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c10

--- Comment #10 from Ely G <elydgolden@gmail.com> ---
Created attachment 878100
--> https://bugzilla.suse.com/attachment.cgi?id=878100&action=edit
journalctl messages on unpatched kernel 2

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c11

--- Comment #11 from Ely G <elydgolden@gmail.com> ---
Created attachment 878101
--> https://bugzilla.suse.com/attachment.cgi?id=878101&action=edit
journalctl messages on patched kernel

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c12

--- Comment #12 from Ely G <elydgolden@gmail.com> ---
Created attachment 878102
--> https://bugzilla.suse.com/attachment.cgi?id=878102&action=edit
dmesg messages on unpatched kernel 1

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c13

--- Comment #13 from Ely G <elydgolden@gmail.com> ---
Created attachment 878103
--> https://bugzilla.suse.com/attachment.cgi?id=878103&action=edit
dmesg messages on unpatched kernel 2

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c14

--- Comment #14 from Ely G <elydgolden@gmail.com> ---
Created attachment 878104
--> https://bugzilla.suse.com/attachment.cgi?id=878104&action=edit
dmesg messages on patched kernel

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c15

Ely G <elydgolden@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Flags|needinfo? |

--- Comment #15 from Ely G <elydgolden@gmail.com> ---
(In reply to Takashi Iwai from comment #8)
> Ideally speaking, yes.

Seems like the freezing has stopped, even after multiple sleep/wake cycles 🥳🥳

Following the advice of third-coffee-of-the-hour in the forums (https://forums.opensuse.org/t/experencing-random-crashes-ever-since-20240927...), I recorded the output of `dmesg -W` and `journalctl -fp4` while suspending and resuming the machine. I did this with both the ordinary and the patched kernel. There seems to be a significant difference between the two, but I can't tell what it is, as I still cannot reproduce the freeze on demand on the unpatched kernel.

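One way to narrow down the difference between two such captures is to strip the timestamps and list the messages that appear in only one of them. A rough sketch, assuming both captures were saved as text in the default `dmesg` format with a leading `[seconds.microseconds]` stamp; the function name and the file names are hypothetical:

```shell
# List kernel messages that occur in the first capture but not in the second.
# Timestamps are stripped first so that identical messages compare equal.
dmesg_only_in_first() {
    a=$(mktemp) || return 1
    b=$(mktemp) || { rm -f "$a"; return 1; }
    # Drop a leading "[  123.456789] " stamp, then sort and de-duplicate.
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$1" | sort -u > "$a"
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //' "$2" | sort -u > "$b"
    # comm -23 keeps the lines that are unique to the first sorted input.
    comm -23 "$a" "$b"
    rm -f "$a" "$b"
}

# Hypothetical usage:
#   dmesg_only_in_first dmesg-unpatched.txt dmesg-patched.txt
```

Messages that embed addresses, PIDs or counters will still show up as differences, so the output is a starting point for reading rather than a verdict.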
https://bugzilla.suse.com/show_bug.cgi?id=1231599#c17

--- Comment #17 from OBSbugzilla Bot <bwiedemann+obsbugzillabot@suse.com> ---
This is an autogenerated message for OBS integration:
This bug (1231599) was mentioned in
https://build.opensuse.org/request/show/1217138 Factory / kernel-source

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c18

--- Comment #18 from Ely G <elydgolden@gmail.com> ---
(In reply to Takashi Iwai from comment #16)
> Good to hear. Most of the rest kernel warnings are from Nvidia driver, hence they are irrelevant with this bug.
>
> I pushed the tentative fix patch https://lore.kernel.org/20240822052310.25220-1-hao.qin@mediatek.com to TW kernel, so that the later 6.11.x TW kernel will contain the fix. Meanwhile, the bug should be addressed in the upstream properly, and let's keep this entry opened for that.

Seems the automated OBS bot slated the patch for inclusion in the 6.11.4 and 6.11.5 line. Since I can't currently update my system properly while running your branch of the patched kernel, how will I know when I can safely switch back to the mainline kernel?

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c21

--- Comment #21 from Fill <best.scouring105@passinbox.com> ---
(In reply to Ely G from comment #19)
> I assume this is fixed in snapshot 20241024? Seems to have the 6.11.5 kernel

Yes, the patch that addresses this is included in Tumbleweed's kernel-default-6.11.5 package, released with 20241024. It wasn't included in the previous version, TW's 6.11.3-2, and 6.11.4 was skipped.

https://bugzilla.suse.com/show_bug.cgi?id=1231599#c22

--- Comment #22 from Fill <best.scouring105@passinbox.com> ---
And as Takashi Iwai says, you can verify that you have it with something like

    rpm -q --changelog kernel-default | grep -C2 1231599

which should produce

    * Wed Oct 16 2024 tiwai@suse.de
    - Bluetooth: btmtk: Remove resetting mt7921 before downloading the fw (bsc#1231599).
    - commit a3c998f

Note that since the patch isn't mainlined (yet?), I wouldn't necessarily expect other distros to include it. It sounds like there's still some investigating being done by the manufacturer.

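For scripting, the same changelog check can be reduced to an exit status. A small sketch; the function name `changelog_has_fix` is invented here. It only inspects the changelog of the installed `kernel-default` package(s), so it says nothing about which kernel is currently booted (compare with `uname -r`), and it will also succeed if only one of several installed kernel versions carries the entry.

```shell
# Succeeds (exit status 0) if the changelog text on stdin references
# this bug as bsc#1231599, and fails otherwise.
changelog_has_fix() {
    grep -q 'bsc#1231599'
}

# Hypothetical usage on an RPM-based system:
#   if rpm -q --changelog kernel-default | changelog_has_fix; then
#       echo "installed kernel-default changelog mentions the btmtk fix"
#   else
#       echo "fix not found in the installed kernel-default changelog"
#   fi
```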
https://bugzilla.suse.com/show_bug.cgi?id=1231599#c27

Frank Krüger <fkrueger@mailbox.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |fkrueger@mailbox.org

--- Comment #27 from Frank Krüger <fkrueger@mailbox.org> ---
(In reply to Takashi Iwai from comment #26)
> (In reply to Rojon from comment #24)
> > Hello everyone,
> >
> > I want to report that even with the patch by Takashi Iwai (thanks!), I am regularly experiencing hard freezes which force me to hard reboot the system. While this most often happens when resuming from sleep, I just had a hard freeze during normal usage (AFAIK). I have been updating my system daily ever since.
>
> Are you sure that your problem is really related with this bug? This is about the USB device 04ca:3802. If not, please open another bug report.

JFYI: There seems to be a similar issue discussed in the openSUSE forum at https://forums.opensuse.org/t/tumbleweed-wont-recover-from-sleep-force-shutd..., which exists even for kernel 6.11.6, but is solved by turning off BT.
