Kernel 6.6.1-1.1 Problems
I updated to TW 20231113 today, which uses kernel 6.6.1-1.1.

I am using VMware Workstation Player 17.5. The modules compile fine, I sign them with my key, and everything works.

When I bring up a VM it also works fine until I attempt to shut it down. At that point the VM shutdown starts, but then the process hangs, and the journal has the following messages:

Nov 16 16:55:24 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 4-...D } 18235 jiffies s: 1213 root: 0x10/.
Nov 16 16:55:24 kernel: rcu: blocking rcu_node structures (internal RCU debug):
Nov 16 16:55:24 kernel: Sending NMI from CPU 3 to CPUs 4:
Nov 16 16:55:24 kernel: NMI backtrace for cpu 4 skipped: idling at intel_idle+0x62/0xb0

TW becomes less responsive and I end up having to reboot.

Looking at the journal after I rebooted, I found the following trace information:

Nov 16 16:26:45 kernel: WARNING: CPU: 3 PID: 6026 at kernel/rcu/tree_plugin.h:734 rcu_sched_clock_irq+0xb2c/0x1120
Nov 16 16:26:45 kernel: Modules linked in: vmnet(O) vmmon(O) binfmt_misc snd_seq_dummy snd_hrtimer snd_seq af_packet nf_conntrack_netbios_ns nf_conntrack_b>
Nov 16 16:26:45 kernel: irqbypass wmi_bmof rfkill i2c_i801 mxm_wmi snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi pcspkr i2c_smbus efi_pstore uvcvideo >
Nov 16 16:26:45 kernel: CPU: 3 PID: 6026 Comm: vmware-vmx Tainted: G O 6.6.1-1-default #1 openSUSE Tumbleweed 0c6504f7d2c054731662677f280b3>
Nov 16 16:26:45 kernel: Hardware name: ASUS All Series/MAXIMUS VI FORMULA, BIOS 1603 08/15/2014
Nov 16 16:26:45 kernel: RIP: 0010:rcu_sched_clock_irq+0xb2c/0x1120
Nov 16 16:26:45 kernel: Code: 38 08 00 00 85 c0 0f 84 f2 f5 ff ff e9 98 fc ff ff c6 87 39 08 00 00 01 e9 e1 f5 ff ff 4c 89 e7 e8 b9 8e f3 ff e9 0e ff ff ff>
Nov 16 16:26:45 kernel: RSP: 0018:ffffc9000019ce08 EFLAGS: 00010082
Nov 16 16:26:45 kernel: RAX: 00000000ffffffc2 RBX: 0000000000000000 RCX: 0000000009e820b1
Nov 16 16:26:45 kernel: RDX: 000000000000c773 RSI: ffffffff9739b328 RDI: ffff8881bbe75180
Nov 16 16:26:45 kernel: RBP: ffff8888209a8200 R08: 0000000000000000 R09: 0000000000000000
Nov 16 16:26:45 kernel: R10: 0000000000000000 R11: ffffc9000019cff8 R12: ffff8888209aac80
Nov 16 16:26:45 kernel: R13: ffffc90000cabb98 R14: ffff8888209aac90 R15: ffff8888209aa740
Nov 16 16:26:45 kernel: FS: 00007fdb08868c00(0000) GS:ffff888820980000(0000) knlGS:0000000000000000
Nov 16 16:26:45 kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Nov 16 16:26:45 kernel: CR2: 00007fdb060e8000 CR3: 0000000184474005 CR4: 00000000001706e0
Nov 16 16:26:45 kernel: Call Trace:
Nov 16 16:26:45 kernel: <IRQ>
Nov 16 16:26:45 kernel: ? rcu_sched_clock_irq+0xb2c/0x1120
Nov 16 16:26:45 kernel: ? __warn+0x81/0x130
Nov 16 16:26:45 kernel: ? rcu_sched_clock_irq+0xb2c/0x1120
Nov 16 16:26:45 kernel: ? report_bug+0x171/0x1a0
Nov 16 16:26:45 kernel: ? handle_bug+0x3c/0x80
Nov 16 16:26:45 kernel: ? exc_invalid_op+0x17/0x70
Nov 16 16:26:45 kernel: ? asm_exc_invalid_op+0x1a/0x20
Nov 16 16:26:45 kernel: ? rcu_sched_clock_irq+0xb2c/0x1120
Nov 16 16:26:45 kernel: ? load_balance+0x2e9/0xed0
Nov 16 16:26:45 kernel: ? reweight_entity+0x273/0x280
Nov 16 16:26:45 kernel: ? update_load_avg+0x7e/0x780
Nov 16 16:26:45 kernel: update_process_times+0x5f/0x90
Nov 16 16:26:45 kernel: tick_sched_handle+0x21/0x60
Nov 16 16:26:45 kernel: tick_sched_timer+0x6f/0x90
Nov 16 16:26:45 kernel: ? __pfx_tick_sched_timer+0x10/0x10
Nov 16 16:26:45 kernel: __hrtimer_run_queues+0x112/0x2b0
Nov 16 16:26:45 kernel: hrtimer_interrupt+0xf8/0x230
Nov 16 16:26:45 kernel: __sysvec_apic_timer_interrupt+0x50/0x140
Nov 16 16:26:45 kernel: sysvec_apic_timer_interrupt+0x6d/0x90
Nov 16 16:26:45 kernel: </IRQ>
Nov 16 16:26:45 kernel: <TASK>
Nov 16 16:26:45 kernel: asm_sysvec_apic_timer_interrupt+0x1a/0x20
Nov 16 16:26:45 kernel: RIP: 0010:rep_movs_alternative+0x4a/0x70
Nov 16 16:26:45 kernel: Code: 75 f1 c3 cc cc cc cc 66 0f 1f 84 00 00 00 00 00 48 8b 06 48 89 07 48 83 c6 08 48 83 c7 08 83 e9 08 74 df 83 f9 08 73 e8 eb c9>
Nov 16 16:26:45 kernel: RSP: 0018:ffffc90000cabc48 EFLAGS: 00010206
Nov 16 16:26:45 kernel: RAX: 00007fdb060e9010 RBX: 0000000000001000 RCX: 00000000000005e0
Nov 16 16:26:45 kernel: RDX: 0000000000000000 RSI: ffff8883221dca20 RDI: 00007fdb060e8a30
Nov 16 16:26:45 kernel: RBP: 0000000000000000 R08: 0000000000000000 R09: 000000000135e000
Nov 16 16:26:45 kernel: R10: 000000000000000f R11: 000000000135e000 R12: ffffc90000cabe18
Nov 16 16:26:45 kernel: R13: 0000000000001000 R14: ffff8883221dc000 R15: 0000000000000000
Nov 16 16:26:45 kernel: copyout+0x20/0x30
Nov 16 16:26:45 kernel: _copy_to_iter+0x5e/0x4a0
Nov 16 16:26:45 kernel: copy_page_to_iter+0x8b/0x140
Nov 16 16:26:45 kernel: filemap_read+0x1af/0x320
Nov 16 16:26:45 kernel: vfs_read+0x1b8/0x300
Nov 16 16:26:45 kernel: ksys_read+0x67/0xe0
Nov 16 16:26:45 kernel: do_syscall_64+0x60/0x90
Nov 16 16:26:45 kernel: ? do_user_addr_fault+0x20f/0x660
Nov 16 16:26:45 kernel: ? exc_page_fault+0x71/0x160
Nov 16 16:26:45 kernel: entry_SYSCALL_64_after_hwframe+0x6e/0xd8
Nov 16 16:26:45 kernel: RIP: 0033:0x7fdb0830a3bc
Nov 16 16:26:45 kernel: Code: ec 28 48 89 54 24 18 48 89 74 24 10 89 7c 24 08 e8 b7 18 f8 ff 48 8b 54 24 18 48 8b 74 24 10 41 89 c0 8b 7c 24 08 31 c0 0f 05>
Nov 16 16:26:45 kernel: RSP: 002b:00007fff1393dc10 EFLAGS: 00000246 ORIG_RAX: 0000000000000000
Nov 16 16:26:45 kernel: RAX: ffffffffffffffda RBX: 0000000000553f88 RCX: 00007fdb0830a3bc
Nov 16 16:26:45 kernel: RDX: 0000000000553f88 RSI: 00007fdb060aa010 RDI: 000000000000004c
Nov 16 16:26:45 kernel: RBP: 000055754832d8c0 R08: 0000000000000000 R09: 0000000000000000
Nov 16 16:26:45 kernel: R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000553f88
Nov 16 16:26:45 kernel: R13: 0000000000000027 R14: 00007fdb060aa010 R15: 0000000000000001
Nov 16 16:26:45 kernel: </TASK>
Nov 16 16:26:45 kernel: ---[ end trace 0000000000000000 ]---

If I boot using Kernel 6.5.9.1, shutting down the same VM does not cause these issues.

--
Regards,
Joe
On 11/16/23 17:55, Joe Salmeri wrote:
Nov 16 16:55:24 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 4-...D } 18235 jiffies s: 1213 root: 0x10/.
Nov 16 16:55:24 kernel: rcu: blocking rcu_node structures (internal RCU debug):
Nov 16 16:55:24 kernel: Sending NMI from CPU 3 to CPUs 4:
Nov 16 16:55:24 kernel: NMI backtrace for cpu 4 skipped: idling at intel_idle+0x62/0xb0

To me, that looks like a vmware problem.

Larry
On 11/16/23 19:03, Larry Finger via openSUSE Factory wrote:
On 11/16/23 17:55, Joe Salmeri wrote:
Nov 16 16:55:24 kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 4-...D } 18235 jiffies s: 1213 root: 0x10/.
Nov 16 16:55:24 kernel: rcu: blocking rcu_node structures (internal RCU debug):
Nov 16 16:55:24 kernel: Sending NMI from CPU 3 to CPUs 4:
Nov 16 16:55:24 kernel: NMI backtrace for cpu 4 skipped: idling at intel_idle+0x62/0xb0
To me, that looks like a vmware problem.
Larry
When I googled "linux RCU", it says that read-copy update (RCU) is "a scalable high-performance synchronization mechanism implemented in the Linux kernel".

This journal entry also made me think it is a kernel issue:

RIP: 0010:rcu_sched_clock_irq+0xb2c/0x1120

I was thinking it was the 6.6 kernel because it is supposed to include a new CPU scheduler that promises to improve performance and reduce latency. Those messages, especially the RIP message, sound like they might be related to that, but I don't know if TW has it enabled. I am still researching that.

--
Regards,
Joe
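For anyone comparing journals across kernels, here is a minimal sketch for pulling the stalled CPU numbers out of the "expedited stalls" line quoted above. The helper name stalled_cpus and the regular expression are hypothetical, written only against the message format shown in this thread (entries like "4-...D", where the number before the dash is the CPU); other kernels may format the line differently.

```python
import re

# Hypothetical pattern: matches the brace-delimited CPU/task list in an
# "rcu_preempt detected expedited stalls" journal line as quoted in this thread.
STALL_RE = re.compile(r"detected expedited stalls on CPUs/tasks: \{([^}]*)\}")

def stalled_cpus(line):
    """Return the CPU numbers listed in an expedited-stall line, or [] if none."""
    match = STALL_RE.search(line)
    if match is None:
        return []
    cpus = []
    for token in match.group(1).split():
        # Assumption: CPU entries start with a digit ("4-...D");
        # anything else (for example task entries) is skipped.
        if token[0].isdigit():
            cpus.append(int(token.split("-", 1)[0]))
    return cpus

if __name__ == "__main__":
    sample = ("Nov 16 16:55:24 kernel: rcu: INFO: rcu_preempt detected expedited "
              "stalls on CPUs/tasks: { 4-...D } 18235 jiffies s: 1213 root: 0x10/.")
    print(stalled_cpus(sample))
```

Feeding it saved journal lines from a boot on each kernel would show whether the same CPU is reported every time the VM shutdown hangs.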