Re: TW 20250216 Kernel 6.13.2 USB issues

24 Feb 2025

      On 2/20/25 4:13 PM, Joe Salmeri wrote:
...
In January I was running TW 20250106 which was using kernel 6.12.8 and I 
ran into multiple USB corruption issues.
At first I thought that the USB drive might be failing, but then a 2nd 
USB drive also got corrupted.
Using TW 20250106 I booted using kernel 6.11.8 instead of 6.12.8 and all 
USB problems went away and both drives worked fine.
After some research I saw that lots of people were reporting USB 
problems with kernel 6.12.8.
For the rest of Jan and up through 02/17/2025 I continued to run TW 
20250106 using kernel 6.11.8 ( instead of 6.12.8 ) and using multiple 
USB devices all month long with NO problems.
Then I updated to TW 20250216 using kernel 6.13.2.
I saw that up through kernel 6.13-rc4 there were still USB problems, but 
figured I would try the released 6.13.2 kernel with TW 20250216.
It "had" been working fine until earlier today, when I hit a kernel BUG 
while accessing a USB drive ( a different drive than last month ).
Here is the journal:
Feb 20 15:39:22 kernel: BUG: unable to handle page fault for address: 
ffffffad9c1c7800
Feb 20 15:39:22 kernel: #PF: supervisor instruction fetch in kernel mode
Feb 20 15:39:22 kernel: #PF: error_code(0x0010) - not-present page
Feb 20 15:39:22 kernel: PGD 108e03d067 P4D 108e03d067 PUD 0
Feb 20 15:39:22 kernel: Oops: Oops: 0010 [#1] PREEMPT SMP NOPTI
Feb 20 15:39:22 kernel: CPU: 6 UID: 1000 PID: 69034 Comm: python3 Not 
tainted 6.13.2-1-default #1 openSUSE Tumbleweed 
cdfe16bec344147391efeacaa0fc0377c0d20a85
Feb 20 15:39:22 kernel: Hardware name: ASUS System Product Name/ROG 
MAXIMUS Z790 FORMULA, BIOS 1202 04/18/2024
Feb 20 15:39:22 kernel: RIP: 0010:0xffffffad9c1c7800
Feb 20 15:39:22 kernel: Code: Unable to access opcode bytes at 
0xffffffad9c1c77d6.
Feb 20 15:39:22 kernel: RSP: 0018:ffffbe6d8c62b8af EFLAGS: 00010246
Feb 20 15:39:22 kernel: RAX: 0000000000000000 RBX: 0000000000002400 RCX: 
0000000000000006
Feb 20 15:39:22 kernel: RDX: ffff96d350380000 RSI: ffff96d3e9a39048 RDI: 
ffff96d350380000
Feb 20 15:39:22 kernel: RBP: 00000000000083ff R08: ffff96d80dc4f600 R09: 
0000000000000000
Feb 20 15:39:22 kernel: R10: 0000000000000001 R11: ffff96d3e9a39000 R12: 
000000fffbffff00
Feb 20 15:39:22 kernel: R13: ffe70d318e790000 R14: ffffbe6d8c62b978 R15: 
ffff96d29fe022a0
Feb 20 15:39:22 kernel: FS:  00007fddbafab580(0000) 
GS:ffff96e1feb00000(0000) knlGS:0000000000000000
Feb 20 15:39:22 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 20 15:39:22 kernel: CR2: ffffffad9c1c77d6 CR3: 00000009a865a001 CR4: 
0000000000f72ef0
Feb 20 15:39:22 kernel: PKRU: 55555554
Feb 20 15:39:22 kernel: Call Trace:
Feb 20 15:39:22 kernel:  <TASK>
Feb 20 15:39:22 kernel:  ? __die_body.cold+0x19/0x26
Feb 20 15:39:22 kernel:  ? page_fault_oops+0x132/0x2a0
Feb 20 15:39:22 kernel:  ? exc_page_fault+0x160/0x170
Feb 20 15:39:22 kernel:  ? asm_exc_page_fault+0x26/0x30
Feb 20 15:39:22 kernel:  ? page_cache_ra_unbounded+0x198/0x200
Feb 20 15:39:22 kernel:  ? filemap_get_pages+0x565/0x6f0
Feb 20 15:39:22 kernel:  ? filemap_read+0xec/0x370
Feb 20 15:39:22 kernel:  ? filemap_read+0x33c/0x370
Feb 20 15:39:22 kernel:  ? aa_file_perm+0x122/0x4e0
Feb 20 15:39:22 kernel:  ? apparmor_file_permission+0x75/0x190
Feb 20 15:39:22 kernel:  ? vfs_read+0x25f/0x330
Feb 20 15:39:22 kernel:  ? ksys_read+0x64/0xe0
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x82/0x160
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? syscall_exit_to_user_mode+0x37/0x1d0
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? do_pselect.constprop.0+0xd7/0x170
Feb 20 15:39:22 kernel:  ? syscall_exit_to_user_mode+0x37/0x1d0
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? syscall_exit_to_user_mode+0x37/0x1d0
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? do_pselect.constprop.0+0xd7/0x170
Feb 20 15:39:22 kernel:  ? syscall_exit_to_user_mode+0x37/0x1d0
Feb 20 15:39:22 kernel:  ? do_syscall_64+0x8e/0x160
Feb 20 15:39:22 kernel:  ? switch_fpu_return+0x4e/0xd0
Feb 20 15:39:22 kernel:  ? arch_exit_to_user_mode_prepare.isra.0+0x79/0x90
Feb 20 15:39:22 kernel:  ? entry_SYSCALL_64_after_hwframe+0x76/0x7e
Feb 20 15:39:22 kernel:  </TASK>
Feb 20 15:39:22 kernel: Modules linked in: exfat vhost_net tun vhost 
vhost_iotlb macvtap macvlan tap nft_reject_ipv4 act_csum cls_u32 sch_htb 
nf_nat_tftp nf_conntrack_tftp>
Feb 20 15:39:22 kernel:  snd_sof_xtensa_dsp snd_sof snd_sof_utils 
snd_soc_acpi_intel_match snd_soc_acpi_intel_sdca_quirks 
soundwire_generic_allocation snd_soc_acpi soundwi>
Feb 20 15:39:22 kernel:  tiny_power_button mc kvm joydev pcspkr 
thunderbolt wmi_bmof libphy soundcore i2c_mux spi_intel 
serial_multi_instantiate rfkill intel_vsec pmt_clas>
Feb 20 15:39:22 kernel: CR2: ffffffad9c1c7800
Feb 20 15:39:22 kernel: ---[ end trace 0000000000000000 ]---
Feb 20 15:39:22 kernel: RIP: 0010:0xffffffad9c1c7800
Feb 20 15:39:22 kernel: Code: Unable to access opcode bytes at 
0xffffffad9c1c77d6.
Feb 20 15:39:22 kernel: RSP: 0018:ffffbe6d8c62b8af EFLAGS: 00010246
Feb 20 15:39:22 kernel: RAX: 0000000000000000 RBX: 0000000000002400 RCX: 
0000000000000006
Feb 20 15:39:22 kernel: RDX: ffff96d350380000 RSI: ffff96d3e9a39048 RDI: 
ffff96d350380000
Feb 20 15:39:22 kernel: RBP: 00000000000083ff R08: ffff96d80dc4f600 R09: 
0000000000000000
Feb 20 15:39:22 kernel: R10: 0000000000000001 R11: ffff96d3e9a39000 R12: 
000000fffbffff00
Feb 20 15:39:22 kernel: R13: ffe70d318e790000 R14: ffffbe6d8c62b978 R15: 
ffff96d29fe022a0
Feb 20 15:39:22 kernel: FS:  00007fddbafab580(0000) 
GS:ffff96e1feb00000(0000) knlGS:0000000000000000
Feb 20 15:39:22 kernel: CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 20 15:39:22 kernel: CR2: ffffffad9c1c77d6 CR3: 00000009a865a001 CR4: 
0000000000f72ef0
Feb 20 15:39:22 kernel: PKRU: 55555554
Feb 20 15:39:22 kernel: note: python3[69034] exited with irqs disabled
After that occurs the system still seems to be running fine; however, 
the kernel is tainted with a value of 128, which says the kernel 
experienced a die event ( an oops or BUG ).
This is very much like the issue I saw with kernel 6.12.8 that did not 
occur with kernel 6.11.8.
Anybody else seeing anything like this ?
Just had this happen again.

A common denominator seems to be that it occurs while a Python program 
is reading a large number of files from a USB drive.
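
For anyone who wants to try a similar workload, here is a minimal sketch 
of a bulk reader. It is not the actual program that triggered the oops; 
the function name read_all_files and the 1 MiB chunk size are 
assumptions made for illustration. It walks a directory tree (for 
example the mount point of the USB drive) and reads every file to the 
end.

```python
import os


def read_all_files(root, chunk_size=1024 * 1024):
    """Read every regular file under root and return (files, bytes, errors).

    Hypothetical helper for illustration only; it is not the program
    from the original report.
    """
    total_files = 0
    total_bytes = 0
    errors = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as f:
                    # Read in chunks so large files do not need to fit in memory.
                    while chunk := f.read(chunk_size):
                        total_bytes += len(chunk)
                total_files += 1
            except OSError as exc:
                # Record I/O errors instead of aborting the whole run.
                errors.append((path, exc))
    return total_files, total_bytes, errors
```

Calling read_all_files() with the drive's mount point returns the number 
of files read, the number of bytes read, and a list of files that could 
not be read. Note that a second run may be served largely from the page 
cache rather than from the drive, so unmounting and remounting between 
runs is probably needed to exercise the USB path again.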

It does not happen every time, though. After it happens the kernel is 
tainted with a value of 128, so I reboot.
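
The taint value is a bitmask, and it can be read from 
/proc/sys/kernel/tainted. A rough sketch for decoding it follows. The 
names TAINT_FLAGS, decode_taint and read_taint are made up for this 
example, and the table covers only the lower bits as described in the 
kernel's tainted-kernels documentation.

```python
# Lower taint bits as listed in the kernel's tainted-kernels documentation.
# This table is intentionally partial; higher bits are reported as unknown.
TAINT_FLAGS = {
    0: ("P", "proprietary module was loaded"),
    1: ("F", "module was force loaded"),
    2: ("S", "kernel running on an out-of-specification system"),
    3: ("R", "module was force unloaded"),
    4: ("M", "processor reported a Machine Check Exception"),
    5: ("B", "bad page referenced or unexpected page flags"),
    6: ("U", "taint requested by userspace application"),
    7: ("D", "kernel died recently (an oops or BUG occurred)"),
    8: ("A", "ACPI table overridden by user"),
    9: ("W", "kernel issued a warning"),
}


def decode_taint(value):
    """Return a list of (letter, meaning) tuples for each set bit."""
    flags = []
    for bit in range(value.bit_length()):
        if value >> bit & 1:
            flags.append(TAINT_FLAGS.get(bit, ("?", f"unknown bit {bit}")))
    return flags


def read_taint(path="/proc/sys/kernel/tainted"):
    """Read the current taint bitmask from procfs (Linux only)."""
    with open(path) as f:
        return int(f.read().strip())
```

With this, decode_taint(128) yields only the "D" flag, which matches the 
state after the oops above, and decode_taint(read_taint()) decodes the 
running kernel's current value.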

After rebooting, I can plug in the same drive and run the same Python 
program to read the data, and it completes successfully.

Unlike kernel 6.12.8, which had major USB problems including drive 
corruption, with kernel 6.13.2 I have only had this not-present-page 
issue.

-- 
Regards,

Joe
