[oS-en] kernel problem, machine locks

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I was watching TV and was about to go to bed. When I went to hibernate the computer, it was beeping every few seconds and the keyboard was locked. I had to hit the hardware reset button.

I saw this in the log, going on for about two hours:

Feb 17 00:27:05 Telcontar getnews.timer[20191]: Fetchnews run ok. Feb 17 00:27:05 Telcontar systemd[1]: leafnode-hourly.service: Deactivated successfully. Feb 17 00:27:47 Telcontar kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 1218883831 wd_nsec: 1218882986 Feb 17 00:27:56 Telcontar kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 3150770243 wd_nsec: 3150768044 Feb 17 00:28:42 Telcontar kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* ring sdma0 timeout, signaled seq=14910, emitted seq=14910 Feb 17 00:28:42 Telcontar kernel: [drm:amdgpu_job_timedout [amdgpu]] *ERROR* Process information: process pid 0 thread pid 0 Feb 17 00:28:42 Telcontar kernel: ------------[ cut here ]------------ Feb 17 00:28:42 Telcontar kernel: WARNING: CPU: 3 PID: 19833 at ../include/linux/dma-fence.h:580 amdgpu_job_timedout+0x1f8/0x230 [amdgpu] Feb 17 00:28:42 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> Feb 17 00:28:42 Telcontar kernel: raid6_pq libcrc32c dm_mod amdgpu intel_rapl_msr ppdev snd_hda_codec_realtek snd_hda_codec_generic battery snd_hda_codec_hdmi amd_atl drm_exec intel_rapl_common amdxcp drm_buddy edac_mce_amd gpu_sched snd_hda_intel i2c_algo_bit drm_suballoc_helper snd_intel_dspcfg snd_intel_sdw_acpi drm_display_helper r8169 kvm_amd snd_hda_codec drm_ttm_h> Feb 17 00:28:42 Telcontar kernel: scsi_dh_rdac scsi_dh_alua crypto_simd nvme_auth sg t10_pi cryptd
ccp usbcore crc64_rocksoft_generic scsi_mod sp5100_tco(n) crc64_rocksoft crc64 wmi msr efivarfs Feb 17 00:28:42 Telcontar kernel: Unloaded tainted modules: msi_ec(n):1 Feb 17 00:28:42 Telcontar kernel: Supported: No, Unsupported modules are loaded Feb 17 00:28:42 Telcontar kernel: CPU: 3 PID: 19833 Comm: kworker/u64:0 Tainted: G OE n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494 Feb 17 00:28:42 Telcontar kernel: Hardware name: Micro-Star International Co., Ltd. MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019 Feb 17 00:28:42 Telcontar kernel: Workqueue: amdgpu-reset-dev drm_sched_job_timedout [gpu_sched] Feb 17 00:28:42 Telcontar kernel: RIP: 0010:amdgpu_job_timedout+0x1f8/0x230 [amdgpu] Feb 17 00:28:42 Telcontar kernel: Code: 8b 7b 88 48 8d 55 80 4c 89 e6 e8 d3 9d db ff 85 c0 0f 84 3e ff ff ff 89 c6 48 c7 c7 18 47 df c1 e8 8d e8 b8 fa e9 2b ff ff ff <0f> 0b e9 f2 fe ff ff e8 4c 40 f3 fa 49 8b 44 24 18 48 c7 c6 10 b2 Feb 17 00:28:42 Telcontar kernel: RSP: 0018:ffffabe563f3fdc0 EFLAGS: 00010202 Feb 17 00:28:42 Telcontar kernel: RAX: ffff91148029f780 RBX: ffff911316e24ef8 RCX: 0000000000000027 Feb 17 00:28:42 Telcontar kernel: RDX: 0000000000000001 RSI: 00000000fffeffff RDI: ffff91215e1a3508 Feb 17 00:28:42 Telcontar kernel: RBP: ffffabe563f3fe50 R08: 0000000000000000 R09: c0000000fffeffff Feb 17 00:28:42 Telcontar kernel: R10: ffffabe563f3fe70 R11: ffffabe563f3fb90 R12: ffff9113a7a33000 Feb 17 00:28:42 Telcontar kernel: R13: ffff911316e00000 R14: ffff9113a7a33000 R15: ffff91118b08d900 Feb 17 00:28:42 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e180000(0000) knlGS:0000000000000000 Feb 17 00:28:42 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 17 00:28:42 Telcontar kernel: CR2: 00007fc68afe5fa0 CR3: 0000000227544000 CR4: 0000000000350ee0 Feb 17 00:28:42 Telcontar kernel: Call Trace: Feb 17 00:28:42 Telcontar kernel: <TASK> Feb 17 00:28:42 Telcontar kernel: ? 
__warn+0x7d/0x140 Feb 17 00:28:42 Telcontar kernel: ? amdgpu_job_timedout+0x1f8/0x230 [amdgpu 90e289b9939e2e42904549b39c68f0f830f1416f] Feb 17 00:28:42 Telcontar kernel: ? report_bug+0xfb/0x1e0 Feb 17 00:28:42 Telcontar kernel: ? handle_bug+0x44/0x80 Feb 17 00:28:42 Telcontar kernel: ? exc_invalid_op+0x13/0x60 Feb 17 00:28:42 Telcontar kernel: ? asm_exc_invalid_op+0x16/0x20 Feb 17 00:28:42 Telcontar kernel: ? amdgpu_job_timedout+0x1f8/0x230 [amdgpu 90e289b9939e2e42904549b39c68f0f830f1416f] Feb 17 00:28:42 Telcontar kernel: ? __update_idle_core+0x5d/0xc0 Feb 17 00:28:42 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:28:42 Telcontar kernel: ? finish_task_switch+0x8a/0x2d0 Feb 17 00:28:42 Telcontar kernel: ? drm_sched_job_timedout+0x68/0xd0 [gpu_sched 25239add3a906380d8957fb09d365791faaca961] Feb 17 00:28:42 Telcontar kernel: ? __pfx_amdgpu_job_timedout+0x10/0x10 [amdgpu 90e289b9939e2e42904549b39c68f0f830f1416f] Feb 17 00:28:42 Telcontar kernel: drm_sched_job_timedout+0x68/0xd0 [gpu_sched 25239add3a906380d8957fb09d365791faaca961] Feb 17 00:28:42 Telcontar kernel: process_one_work+0x226/0x460 Feb 17 00:28:42 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:28:42 Telcontar kernel: worker_thread+0x2a/0x3b0 Feb 17 00:28:42 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:28:42 Telcontar kernel: kthread+0xe1/0x120 Feb 17 00:28:42 Telcontar kernel: ? __pfx_kthread+0x10/0x10 Feb 17 00:28:42 Telcontar kernel: ret_from_fork+0x2c/0x50 Feb 17 00:28:42 Telcontar kernel: </TASK> Feb 17 00:28:42 Telcontar kernel: ---[ end trace 0000000000000000 ]--- Feb 17 00:28:42 Telcontar kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset begin! 
Feb 17 00:29:02 Telcontar kernel: clocksource: Long readout interval, skipping watchdog check: cs_nsec: 11406283928 wd_nsec: 11406277123 Feb 17 00:29:05 Telcontar kernel: amdgpu 0000:27:00.0: amdgpu: Guilty job already signaled, skipping HW reset Feb 17 00:29:05 Telcontar kernel: amdgpu 0000:27:00.0: amdgpu: GPU reset(1) succeeded! Feb 17 00:30:01 Telcontar CRON[20396]: (root) CMD ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1) Feb 17 00:30:01 Telcontar CRON[20393]: (root) CMDEND ([ -x /usr/lib64/sa/sa1 ] && exec /usr/lib64/sa/sa1 1 1) Feb 17 00:30:43 Telcontar systemd[1]: Started Session c101 of User cer. Feb 17 00:34:43 Telcontar fetchnews[20707]: fetchnews: 0 articles and 0 headers fetched, 0 killed, 0 posted, in 143 seconds Feb 17 00:34:45 Telcontar getnews.timer[20849]: Fetchnews run ok. Feb 17 00:34:45 Telcontar systemd[1]: leafnode-hourly.service: Deactivated successfully. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: The canary thread is apparently starving. Taking action. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: Demoting known real-time threads. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: Successfully demoted thread 29270 of process 28237. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: Successfully demoted thread 27128 of process 27015. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: Successfully demoted thread 26748 of process 26598. Feb 17 00:34:49 Telcontar rtkit-daemon[5348]: Demoted 3 threads. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: The canary thread is apparently starving. Taking action. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: Demoting known real-time threads. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: Successfully demoted thread 29270 of process 28237. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: Successfully demoted thread 27128 of process 27015. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: Successfully demoted thread 26748 of process 26598. Feb 17 00:35:20 Telcontar rtkit-daemon[5348]: Demoted 3 threads. 
Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: The canary thread is apparently starving. Taking action. Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: Demoting known real-time threads. Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: Successfully demoted thread 29270 of process 28237. Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: Successfully demoted thread 27128 of process 27015. Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: Successfully demoted thread 26748 of process 26598. Feb 17 00:36:28 Telcontar rtkit-daemon[5348]: Demoted 3 threads. Feb 17 00:38:35 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [kworker/u64:4:19295] Feb 17 00:38:35 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> Feb 17 00:38:35 Telcontar kernel: raid6_pq libcrc32c dm_mod amdgpu intel_rapl_msr ppdev snd_hda_codec_realtek snd_hda_codec_generic battery snd_hda_codec_hdmi amd_atl drm_exec intel_rapl_common amdxcp drm_buddy edac_mce_amd gpu_sched snd_hda_intel i2c_algo_bit drm_suballoc_helper snd_intel_dspcfg snd_intel_sdw_acpi drm_display_helper r8169 kvm_amd snd_hda_codec drm_ttm_h> Feb 17 00:38:35 Telcontar kernel: scsi_dh_rdac scsi_dh_alua crypto_simd nvme_auth sg t10_pi cryptd ccp usbcore crc64_rocksoft_generic scsi_mod sp5100_tco(n) crc64_rocksoft crc64 wmi msr efivarfs Feb 17 00:38:35 Telcontar kernel: Unloaded tainted modules: msi_ec(n):1 Feb 17 00:38:35 Telcontar kernel: Supported: No, Unsupported modules are loaded Feb 17 00:38:35 Telcontar kernel: CPU: 6 PID: 19295 Comm: kworker/u64:4 Tainted: G W OE n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494 Feb 17 00:38:35 Telcontar kernel: Hardware name: Micro-Star International Co., 
Ltd. MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019 Feb 17 00:38:35 Telcontar kernel: Workqueue: writeback wb_workfn (flush-259:0) Feb 17 00:38:35 Telcontar kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x238/0x2c0 Feb 17 00:38:35 Telcontar kernel: Code: 83 c5 01 41 c1 e4 10 41 c1 e5 12 45 09 ec 44 89 e0 c1 e8 10 66 87 45 02 89 c2 c1 e2 10 85 d2 75 39 31 d2 eb 02 f3 90 8b 45 00 <66> 85 c0 75 f6 89 c1 66 31 c9 41 39 cc 74 5f 48 85 d2 c6 45 00 01 Feb 17 00:38:35 Telcontar kernel: RSP: 0018:ffffabe56315fad0 EFLAGS: 00000202 Feb 17 00:38:35 Telcontar kernel: RAX: 0000000000180001 RBX: ffff91215e336f00 RCX: 0000000000000000 Feb 17 00:38:35 Telcontar kernel: RDX: ffff91215e436f00 RSI: 0000000000100000 RDI: ffff911361408458 Feb 17 00:38:35 Telcontar kernel: RBP: ffff911361408458 R08: ffff911361408428 R09: 0000000000035900 Feb 17 00:38:35 Telcontar kernel: R10: ffffabe54b9efb78 R11: 00000000000002af R12: 00000000001c0000 Feb 17 00:38:35 Telcontar kernel: R13: 00000000001c0000 R14: ffffabe56315fb50 R15: 00000001010c731b Feb 17 00:38:35 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e300000(0000) knlGS:0000000000000000 Feb 17 00:38:35 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 17 00:38:35 Telcontar kernel: CR2: 00007ff08ac07fc0 CR3: 00000003fbf20000 CR4: 0000000000350ee0 Feb 17 00:38:35 Telcontar kernel: Call Trace: Feb 17 00:38:35 Telcontar kernel: <IRQ> Feb 17 00:38:35 Telcontar kernel: ? watchdog_timer_fn+0x1ae/0x210 Feb 17 00:38:35 Telcontar kernel: ? __pfx_watchdog_timer_fn+0x10/0x10 Feb 17 00:38:35 Telcontar kernel: ? __hrtimer_run_queues+0x111/0x280 Feb 17 00:38:35 Telcontar kernel: ? hrtimer_interrupt+0xe5/0x250 Feb 17 00:38:35 Telcontar kernel: ? __sysvec_apic_timer_interrupt+0x5a/0x120 Feb 17 00:38:35 Telcontar kernel: ? sysvec_apic_timer_interrupt+0x4b/0x90 Feb 17 00:38:35 Telcontar kernel: </IRQ> Feb 17 00:38:35 Telcontar kernel: <TASK> Feb 17 00:38:35 Telcontar kernel: ? 
asm_sysvec_apic_timer_interrupt+0x16/0x20 Feb 17 00:38:35 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x238/0x2c0 Feb 17 00:38:35 Telcontar kernel: _raw_spin_lock+0x25/0x30 Feb 17 00:38:35 Telcontar kernel: __wb_update_bandwidth+0x3c/0x1c0 Feb 17 00:38:35 Telcontar kernel: wb_update_bandwidth+0x4e/0x70 Feb 17 00:38:35 Telcontar kernel: do_writepages+0x18e/0x1b0 Feb 17 00:38:35 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x235/0x2c0 Feb 17 00:38:35 Telcontar kernel: __writeback_single_inode+0x41/0x350 Feb 17 00:38:35 Telcontar kernel: writeback_sb_inodes+0x1d7/0x470 Feb 17 00:38:35 Telcontar kernel: __writeback_inodes_wb+0x5f/0xd0 Feb 17 00:38:35 Telcontar kernel: wb_writeback+0x281/0x300 Feb 17 00:38:35 Telcontar kernel: wb_workfn+0x1f5/0x470 Feb 17 00:38:35 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:38:35 Telcontar kernel: ? __schedule+0x389/0x1540 Feb 17 00:38:35 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:38:35 Telcontar kernel: process_one_work+0x226/0x460 Feb 17 00:38:35 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:38:35 Telcontar kernel: worker_thread+0x2a/0x3b0 Feb 17 00:38:35 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:38:35 Telcontar kernel: kthread+0xe1/0x120 Feb 17 00:38:35 Telcontar kernel: ? __pfx_kthread+0x10/0x10 Feb 17 00:38:35 Telcontar kernel: ret_from_fork+0x2c/0x50 Feb 17 00:38:35 Telcontar kernel: </TASK> Feb 17 00:38:36 Telcontar systemd[1]: Started fetching news messages spool hourly. Feb 17 00:38:37 Telcontar fetchnews[21098]: config: debugmode is 1001 Feb 17 00:38:37 Telcontar fetchnews[21098]: fetchnews: version 2.0.0.alpha202101; verbosity level is 0; debugging level is 1001 Feb 17 00:38:37 Telcontar fetchnews[21098]: fetchnews mode: get articles, get headers, get bodies, post articles Feb 17 00:38:37 Telcontar fetchnews[21098]: last active refetch 71 days ago. 
Feb 17 00:38:37 Telcontar fetchnews[21098]: lockfile_exists(timeout=5), fqdn="Telcontar.valinor" Feb 17 00:38:54 Telcontar systemd[1]: Starting Check if mainboard battery is Ok... Feb 17 00:39:01 Telcontar fetchnews[21098]: expiring interesting.groups Feb 17 00:39:01 Telcontar fetchnews[21098]: reading /var/spool/news/leaf.node/groupinfo and /etc/leafnode/local.groups Feb 17 00:39:01 Telcontar fetchnews[21098]: found 0 articles in in.coming. Feb 17 00:39:01 Telcontar fetchnews[21098]: News.Individual.NET: connecting to port nntp Feb 17 00:39:14 Telcontar systemd[1]: check-battery.service: Deactivated successfully. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: The canary thread is apparently starving. Taking action. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: Demoting known real-time threads. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: Successfully demoted thread 29270 of process 28237. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: Successfully demoted thread 27128 of process 27015. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: Successfully demoted thread 26748 of process 26598. Feb 17 00:39:15 Telcontar rtkit-daemon[5348]: Demoted 3 threads. Feb 17 00:39:29 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 23s!
[kworker/u64:4:19295] Feb 17 00:39:29 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> Feb 17 00:39:29 Telcontar kernel: raid6_pq libcrc32c dm_mod amdgpu intel_rapl_msr ppdev snd_hda_codec_realtek snd_hda_codec_generic battery snd_hda_codec_hdmi amd_atl drm_exec intel_rapl_common amdxcp drm_buddy edac_mce_amd gpu_sched snd_hda_intel i2c_algo_bit drm_suballoc_helper snd_intel_dspcfg snd_intel_sdw_acpi drm_display_helper r8169 kvm_amd snd_hda_codec drm_ttm_h> Feb 17 00:39:29 Telcontar kernel: scsi_dh_rdac scsi_dh_alua crypto_simd nvme_auth sg t10_pi cryptd ccp usbcore crc64_rocksoft_generic scsi_mod sp5100_tco(n) crc64_rocksoft crc64 wmi msr efivarfs Feb 17 00:39:29 Telcontar kernel: Unloaded tainted modules: msi_ec(n):1 Feb 17 00:39:29 Telcontar kernel: Supported: No, Unsupported modules are loaded Feb 17 00:39:29 Telcontar kernel: CPU: 6 PID: 19295 Comm: kworker/u64:4 Tainted: G W OEL n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494 Feb 17 00:39:29 Telcontar kernel: Hardware name: Micro-Star International Co., Ltd. 
MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019 Feb 17 00:39:29 Telcontar kernel: Workqueue: writeback wb_workfn (flush-259:0) Feb 17 00:39:29 Telcontar kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x235/0x2c0 Feb 17 00:39:29 Telcontar kernel: Code: eb f5 41 83 c5 01 41 c1 e4 10 41 c1 e5 12 45 09 ec 44 89 e0 c1 e8 10 66 87 45 02 89 c2 c1 e2 10 85 d2 75 39 31 d2 eb 02 f3 90 <8b> 45 00 66 85 c0 75 f6 89 c1 66 31 c9 41 39 cc 74 5f 48 85 d2 c6 Feb 17 00:39:29 Telcontar kernel: RSP: 0018:ffffabe56315fad0 EFLAGS: 00000202 Feb 17 00:39:29 Telcontar kernel: RAX: 0000000000080001 RBX: ffff91215e336f00 RCX: 0000000000000000 Feb 17 00:39:29 Telcontar kernel: RDX: ffff91215e036f00 RSI: 0000000000300000 RDI: ffff911361408458 Feb 17 00:39:29 Telcontar kernel: RBP: ffff911361408458 R08: ffff911361408428 R09: 0000000000035900 Feb 17 00:39:29 Telcontar kernel: R10: ffffabe54bfb7858 R11: 00000000000002fa R12: 00000000001c0000 Feb 17 00:39:29 Telcontar kernel: R13: 00000000001c0000 R14: ffffabe56315fb50 R15: 00000001010ca091 Feb 17 00:39:29 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e300000(0000) knlGS:0000000000000000 Feb 17 00:39:29 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 17 00:39:29 Telcontar kernel: CR2: 000005d3fcea1000 CR3: 000000026fbee000 CR4: 0000000000350ee0 Feb 17 00:39:29 Telcontar kernel: Call Trace: Feb 17 00:39:29 Telcontar kernel: <IRQ> Feb 17 00:39:29 Telcontar kernel: ? watchdog_timer_fn+0x1ae/0x210 Feb 17 00:39:29 Telcontar kernel: ? __pfx_watchdog_timer_fn+0x10/0x10 Feb 17 00:39:29 Telcontar kernel: ? __hrtimer_run_queues+0x111/0x280 Feb 17 00:39:29 Telcontar kernel: ? hrtimer_interrupt+0xe5/0x250 Feb 17 00:39:29 Telcontar kernel: ? __sysvec_apic_timer_interrupt+0x5a/0x120 Feb 17 00:39:29 Telcontar kernel: ? sysvec_apic_timer_interrupt+0x4b/0x90 Feb 17 00:39:29 Telcontar kernel: </IRQ> Feb 17 00:39:29 Telcontar kernel: <TASK> Feb 17 00:39:29 Telcontar kernel: ? 
asm_sysvec_apic_timer_interrupt+0x16/0x20 Feb 17 00:39:29 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x235/0x2c0 Feb 17 00:39:29 Telcontar kernel: _raw_spin_lock+0x25/0x30 Feb 17 00:39:29 Telcontar kernel: __wb_update_bandwidth+0x3c/0x1c0 Feb 17 00:39:29 Telcontar kernel: wb_update_bandwidth+0x4e/0x70 Feb 17 00:39:29 Telcontar kernel: do_writepages+0x18e/0x1b0 Feb 17 00:39:29 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x235/0x2c0 Feb 17 00:39:29 Telcontar kernel: __writeback_single_inode+0x41/0x350 Feb 17 00:39:29 Telcontar kernel: writeback_sb_inodes+0x1d7/0x470 Feb 17 00:39:29 Telcontar kernel: __writeback_inodes_wb+0x5f/0xd0 Feb 17 00:39:29 Telcontar kernel: wb_writeback+0x281/0x300 Feb 17 00:39:29 Telcontar kernel: wb_workfn+0x1f5/0x470 Feb 17 00:39:29 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:39:29 Telcontar kernel: ? __schedule+0x389/0x1540 Feb 17 00:39:29 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:39:29 Telcontar kernel: process_one_work+0x226/0x460 Feb 17 00:39:29 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:39:29 Telcontar kernel: worker_thread+0x2a/0x3b0 Feb 17 00:39:29 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:39:29 Telcontar kernel: kthread+0xe1/0x120 Feb 17 00:39:29 Telcontar kernel: ? __pfx_kthread+0x10/0x10 Feb 17 00:39:29 Telcontar kernel: ret_from_fork+0x2c/0x50 Feb 17 00:39:29 Telcontar kernel: </TASK> Feb 17 00:39:21 Telcontar fetchnews[21098]: trying: address 130.133.4.11 port 119... Feb 17 00:39:23 Telcontar fetchnews[21098]: connected: address 130.133.4.11 port 119. Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: The canary thread is apparently starving. Taking action. Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: Demoting known real-time threads. Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: Successfully demoted thread 29270 of process 28237. Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: Successfully demoted thread 27128 of process 27015. 
Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: Successfully demoted thread 26748 of process 26598. Feb 17 00:39:33 Telcontar rtkit-daemon[5348]: Demoted 3 threads. Feb 17 00:39:54 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 22s! [kworker/u64:4:19295] Feb 17 00:39:54 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> Feb 17 00:40:21 Telcontar kernel: raid6_pq libcrc32c dm_mod amdgpu intel_rapl_msr ppdev snd_hda_codec_realtek snd_hda_codec_generic battery snd_hda_codec_hdmi amd_atl drm_exec intel_rapl_common amdxcp drm_buddy edac_mce_amd gpu_sched snd_hda_intel i2c_algo_bit drm_suballoc_helper snd_intel_dspcfg snd_intel_sdw_acpi drm_display_helper r8169 kvm_amd snd_hda_codec drm_ttm_h> Feb 17 00:40:21 Telcontar kernel: scsi_dh_rdac scsi_dh_alua crypto_simd nvme_auth sg t10_pi cryptd ccp usbcore crc64_rocksoft_generic scsi_mod sp5100_tco(n) crc64_rocksoft crc64 wmi msr efivarfs Feb 17 00:40:21 Telcontar kernel: Unloaded tainted modules: msi_ec(n):1 Feb 17 00:40:21 Telcontar kernel: Supported: No, Unsupported modules are loaded Feb 17 00:40:21 Telcontar kernel: CPU: 6 PID: 19295 Comm: kworker/u64:4 Tainted: G W OEL n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494 Feb 17 00:40:21 Telcontar kernel: Hardware name: Micro-Star International Co., Ltd. 
MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019 Feb 17 00:40:21 Telcontar kernel: Workqueue: writeback wb_workfn (flush-259:0) Feb 17 00:40:21 Telcontar kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x292/0x2c0 Feb 17 00:40:21 Telcontar kernel: Code: c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 00 6f 03 00 48 03 04 d5 60 1c ec bc 48 89 18 8b 43 08 85 c0 75 09 f3 90 <8b> 43 08 85 c0 74 f7 48 8b 13 48 85 d2 74 94 0f 0d 0a eb 8f b9 01 Feb 17 00:40:21 Telcontar kernel: RSP: 0018:ffffabe56315fad0 EFLAGS: 00000246 Feb 17 00:40:21 Telcontar kernel: RAX: 0000000000000000 RBX: ffff91215e336f00 RCX: 0000000000000000 Feb 17 00:40:21 Telcontar kernel: RDX: 0000000000000004 RSI: 0000000000140000 RDI: ffff911361408458 Feb 17 00:40:21 Telcontar kernel: RBP: ffff911361408458 R08: ffff911361408428 R09: 000000000000983c Feb 17 00:40:21 Telcontar kernel: R10: ffffabe56547fbc0 R11: 000000000000023e R12: 00000000001c0000 Feb 17 00:40:21 Telcontar kernel: R13: 00000000001c0000 R14: ffffabe56315fb50 R15: 00000001010cb8f6 Feb 17 00:40:21 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e300000(0000) knlGS:0000000000000000 Feb 17 00:40:21 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 17 00:40:21 Telcontar kernel: CR2: 00007fe9989b9014 CR3: 00000002584c0000 CR4: 0000000000350ee0 Feb 17 00:40:21 Telcontar kernel: Call Trace: Feb 17 00:40:21 Telcontar kernel: <IRQ> Feb 17 00:40:21 Telcontar kernel: ? watchdog_timer_fn+0x1ae/0x210 Feb 17 00:40:21 Telcontar kernel: ? __pfx_watchdog_timer_fn+0x10/0x10 Feb 17 00:40:21 Telcontar kernel: ? __hrtimer_run_queues+0x111/0x280 Feb 17 00:40:21 Telcontar kernel: ? hrtimer_interrupt+0xe5/0x250 Feb 17 00:40:21 Telcontar kernel: ? __sysvec_apic_timer_interrupt+0x5a/0x120 Feb 17 00:40:21 Telcontar kernel: ? sysvec_apic_timer_interrupt+0x4b/0x90 Feb 17 00:40:21 Telcontar kernel: </IRQ> Feb 17 00:40:21 Telcontar kernel: <TASK> Feb 17 00:40:21 Telcontar kernel: ? 
asm_sysvec_apic_timer_interrupt+0x16/0x20 Feb 17 00:40:21 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x292/0x2c0 Feb 17 00:40:21 Telcontar kernel: _raw_spin_lock+0x25/0x30 Feb 17 00:40:21 Telcontar kernel: __wb_update_bandwidth+0x3c/0x1c0 Feb 17 00:40:21 Telcontar kernel: wb_update_bandwidth+0x4e/0x70 Feb 17 00:40:21 Telcontar kernel: do_writepages+0x18e/0x1b0 Feb 17 00:40:21 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x235/0x2c0 Feb 17 00:40:21 Telcontar kernel: __writeback_single_inode+0x41/0x350 Feb 17 00:40:21 Telcontar kernel: writeback_sb_inodes+0x1d7/0x470 Feb 17 00:40:21 Telcontar kernel: __writeback_inodes_wb+0x5f/0xd0 Feb 17 00:40:21 Telcontar kernel: wb_writeback+0x281/0x300 Feb 17 00:40:21 Telcontar kernel: wb_workfn+0x1f5/0x470 Feb 17 00:40:21 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:40:21 Telcontar kernel: ? __schedule+0x389/0x1540 Feb 17 00:40:21 Telcontar kernel: ? srso_return_thunk+0x5/0x5f Feb 17 00:40:21 Telcontar kernel: process_one_work+0x226/0x460 Feb 17 00:40:21 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:40:21 Telcontar kernel: worker_thread+0x2a/0x3b0 Feb 17 00:40:21 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10 Feb 17 00:40:21 Telcontar kernel: kthread+0xe1/0x120 Feb 17 00:40:21 Telcontar kernel: ? __pfx_kthread+0x10/0x10 Feb 17 00:40:21 Telcontar kernel: ret_from_fork+0x2c/0x50 Feb 17 00:40:21 Telcontar kernel: </TASK> Feb 17 00:40:21 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [kworker/u64:4:19295] Feb 17 00:40:21 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> ... 
Feb 17 02:38:07 Telcontar kernel: ret_from_fork+0x2c/0x50 Feb 17 02:38:07 Telcontar kernel: </TASK> Feb 17 02:38:07 Telcontar kernel: watchdog: BUG: soft lockup - CPU#4 stuck for 21s! [kworker/4:1:20203] Feb 17 02:38:07 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n> Feb 17 02:38:07 Telcontar kernel: raid6_pq libcrc32c dm_mod amdgpu intel_rapl_msr ppdev snd_hda_codec_realtek snd_hda_codec_generic battery snd_hda_codec_hdmi amd_atl drm_exec intel_rapl_common amdxcp drm_buddy edac_mce_amd gpu_sched snd_hda_intel i2c_algo_bit drm_suballoc_helper snd_intel_dspcfg snd_intel_sdw_acpi drm_display_helper r8169 kvm_amd snd_hda_codec drm_ttm_h> Feb 17 02:38:07 Telcontar kernel: scsi_dh_rdac scsi_dh_alua crypto_simd nvme_auth sg t10_pi cryptd ccp usbcore crc64_rocksoft_generic scsi_mod sp5100_tco(n) crc64_rocksoft crc64 wmi msr efivarfs Feb 17 02:38:07 Telcontar kernel: Unloaded tainted modules: msi_ec(n):1 Feb 17 02:38:07 Telcontar kernel: Supported: No, Unsupported modules are loaded Feb 17 02:38:07 Telcontar kernel: CPU: 4 PID: 20203 Comm: kworker/4:1 Tainted: G W OEL n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494 Feb 17 02:38:07 Telcontar kernel: Hardware name: Micro-Star International Co., Ltd. 
MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019 Feb 17 02:38:07 Telcontar kernel: Workqueue: inode_switch_wbs inode_switch_wbs_work_fn Feb 17 02:38:07 Telcontar kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x292/0x2c0 Feb 17 02:38:07 Telcontar kernel: Code: c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 00 6f 03 00 48 03 04 d5 60 1c ec bc 48 89 18 8b 43 08 85 c0 75 09 f3 90 <8b> 43 08 85 c0 74 f7 48 8b 13 48 85 d2 74 94 0f 0d 0a eb 8f b9 01 Feb 17 02:38:07 Telcontar kernel: RSP: 0018:ffffabe56449fda0 EFLAGS: 00000246 Feb 17 02:38:07 Telcontar kernel: RAX: 0000000000000000 RBX: ffff91215e236f00 RCX: ffff9113d8951008 Feb 17 02:38:07 Telcontar kernel: RDX: 000000000000000a RSI: 00000000002c0000 RDI: ffff911361408458 Feb 17 02:38:07 Telcontar kernel: RBP: ffff911361408458 R08: ffff911312a82204 R09: ffff91215edfecc0 Feb 17 02:38:07 Telcontar kernel: R10: 8080808080808080 R11: 0000000000000010 R12: 0000000000140000 Feb 17 02:38:07 Telcontar kernel: R13: 0000000000140000 R14: ffff911361408400 R15: ffff91142ece3c00 Feb 17 02:38:07 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e200000(0000) knlGS:0000000000000000 Feb 17 02:38:07 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 Feb 17 02:38:07 Telcontar kernel: CR2: 0000276684f6c000 CR3: 0000000438afe000 CR4: 0000000000350ee0 Feb 17 02:38:07 Telcontar kernel: Call Trace: Feb 17 02:38:07 Telcontar kernel: <IRQ> Feb 17 02:38:07 Telcontar kernel: ? watchdog_timer_fn+0x1ae/0x210 Feb 17 02:38:07 Telcontar kernel: ? __pfx_watchdog_timer_fn+0x10/0x10 Feb 17 02:38:07 Telcontar kernel: ? __hrtimer_run_queues+0x111/0x280 Feb 17 02:38:07 Telcontar kernel: ? hrtimer_interrupt+0xe5/0x250 Feb 17 02:38:07 Telcontar kernel: ? __sysvec_apic_timer_interrupt+0x5a/0x120 Feb 17 02:38:07 Telcontar kernel: ? sysvec_apic_timer_interrupt+0x4b/0x90 Feb 17 02:38:07 Telcontar kernel: </IRQ> Feb 17 02:38:07 Telcontar kernel: <TASK> Feb 17 02:38:07 Telcontar kernel: ? 
asm_sysvec_apic_timer_interrupt+0x16/0x20
Feb 17 02:38:07 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x292/0x2c0
Feb 17 02:38:07 Telcontar kernel: _raw_spin_lock+0x25/0x30
Feb 17 02:38:07 Telcontar kernel: inode_switch_wbs_work_fn+0x647/0x770
Feb 17 02:38:07 Telcontar kernel: ? __schedule+0x389/0x1540
Feb 17 02:38:07 Telcontar kernel: process_one_work+0x226/0x460
Feb 17 02:38:07 Telcontar kernel: worker_thread+0x2a/0x3b0
Feb 17 02:38:07 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10
Feb 17 02:38:07 Telcontar kernel: kthread+0xe1/0x120
Feb 17 02:38:07 Telcontar kernel: ? __pfx_kthread+0x10/0x10
Feb 17 02:38:07 Telcontar kernel: ret_from_fork+0x2c/0x50
Feb 17 02:38:07 Telcontar kernel: </TASK>
Feb 17 02:38:07 Telcontar kernel: rcu: INFO: rcu_preempt detected expedited stalls on CPUs/tasks: { 3-.... } 19075 jiffies s: 5005 root: 0x1/.
Feb 17 02:38:07 Telcontar kernel: rcu: blocking rcu_node structures (internal RCU debug): l=1:0-15:0x8/.
Feb 17 02:38:07 Telcontar kernel: Sending NMI from CPU 9 to CPUs 3:
Feb 17 02:38:07 Telcontar kernel: NMI backtrace for cpu 3
Feb 17 02:38:07 Telcontar kernel: CPU: 3 PID: 19295 Comm: kworker/u64:4 Tainted: G W OEL n 6.4.0-150600.23.38-default #1 SLE15-SP6 46e958e1e6a4044cfee7b3413eeabfe5a22d6494
Feb 17 02:38:07 Telcontar kernel: Hardware name: Micro-Star International Co., Ltd. MS-7B79/X470 GAMING PLUS MAX (MS-7B79), BIOS H.40 11/06/2019
Feb 17 02:38:07 Telcontar kernel: Workqueue: writeback wb_workfn (flush-259:0)
Feb 17 02:38:07 Telcontar kernel: RIP: 0010:native_queued_spin_lock_slowpath+0x292/0x2c0
Feb 17 02:38:07 Telcontar kernel: Code: c1 ea 12 83 e0 03 83 ea 01 48 c1 e0 05 48 63 d2 48 05 00 6f 03 00 48 03 04 d5 60 1c ec bc 48 89 18 8b 43 08 85 c0 75 09 f3 90 <8b> 43 08 85 c0 74 f7 48 8b 13 48 85 d2 74 94 0f 0d 0a eb 8f b9 01
Feb 17 02:38:07 Telcontar kernel: RSP: 0018:ffffabe56315fad0 EFLAGS: 00000246
Feb 17 02:38:07 Telcontar kernel: RAX: 0000000000000000 RBX: ffff91215e1b6f00 RCX: 0000000000000000
Feb 17 02:38:07 Telcontar kernel: RDX: 0000000000000008 RSI: 0000000000240000 RDI: ffff911361408458
Feb 17 02:38:07 Telcontar kernel: RBP: ffff911361408458 R08: ffff911361408428 R09: 00000000000006c5
Feb 17 02:38:07 Telcontar kernel: R10: ffffabe54b5afb88 R11: 000000000000004b R12: 0000000000100000
Feb 17 02:38:07 Telcontar kernel: R13: 0000000000100000 R14: ffffabe56315fb50 R15: 000000010124576b
Feb 17 02:38:07 Telcontar kernel: FS: 0000000000000000(0000) GS:ffff91215e180000(0000) knlGS:0000000000000000
Feb 17 02:38:07 Telcontar kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 17 02:38:07 Telcontar kernel: CR2: 000003418b86f000 CR3: 0000000146382000 CR4: 0000000000350ee0
Feb 17 02:38:07 Telcontar kernel: Call Trace:
Feb 17 02:38:07 Telcontar kernel: <NMI>
Feb 17 02:38:07 Telcontar kernel: ? nmi_cpu_backtrace+0x8d/0x100
Feb 17 02:38:07 Telcontar kernel: ? nmi_cpu_backtrace_handler+0xd/0x20
Feb 17 02:38:07 Telcontar kernel: ? nmi_handle+0x68/0x150
Feb 17 02:38:07 Telcontar kernel: ? default_do_nmi+0x49/0x100
Feb 17 02:38:07 Telcontar kernel: ? exc_nmi+0x1ca/0x270
Feb 17 02:38:07 Telcontar kernel: ? end_repeat_nmi+0x16/0x67
Feb 17 02:38:07 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x292/0x2c0
Feb 17 02:38:07 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x292/0x2c0
Feb 17 02:38:07 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x292/0x2c0
Feb 17 02:38:07 Telcontar kernel: </NMI>
Feb 17 02:38:07 Telcontar kernel: <TASK>
Feb 17 02:38:07 Telcontar kernel: _raw_spin_lock+0x25/0x30
Feb 17 02:38:07 Telcontar kernel: __wb_update_bandwidth+0x3c/0x1c0
Feb 17 02:38:07 Telcontar kernel: wb_update_bandwidth+0x4e/0x70
Feb 17 02:38:07 Telcontar kernel: do_writepages+0x18e/0x1b0
Feb 17 02:38:07 Telcontar kernel: ? native_queued_spin_lock_slowpath+0x235/0x2c0
Feb 17 02:38:07 Telcontar kernel: __writeback_single_inode+0x41/0x350
Feb 17 02:38:07 Telcontar kernel: writeback_sb_inodes+0x1d7/0x470
Feb 17 02:38:07 Telcontar kernel: __writeback_inodes_wb+0x5f/0xd0
Feb 17 02:38:07 Telcontar kernel: wb_writeback+0x281/0x300
Feb 17 02:38:07 Telcontar kernel: wb_workfn+0x1f5/0x470
Feb 17 02:38:07 Telcontar kernel: ? srso_return_thunk+0x5/0x5f
Feb 17 02:38:07 Telcontar kernel: ? __schedule+0x389/0x1540
Feb 17 02:38:07 Telcontar kernel: ? srso_return_thunk+0x5/0x5f
Feb 17 02:38:07 Telcontar kernel: process_one_work+0x226/0x460
Feb 17 02:38:07 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10
Feb 17 02:38:07 Telcontar kernel: worker_thread+0x2a/0x3b0
Feb 17 02:38:07 Telcontar kernel: ? __pfx_worker_thread+0x10/0x10
Feb 17 02:38:07 Telcontar kernel: kthread+0xe1/0x120
Feb 17 02:38:07 Telcontar kernel: ? __pfx_kthread+0x10/0x10
Feb 17 02:38:07 Telcontar kernel: ret_from_fork+0x2c/0x50
Feb 17 02:38:07 Telcontar kernel: </TASK>
Feb 17 02:38:07 Telcontar kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 24s! [kworker/1:1:20190]
Feb 17 02:38:07 Telcontar kernel: Modules linked in: rpcsec_gss_krb5 nfsv4 dns_resolver nfs fscache netfs snd_seq_dummy snd_hrtimer vmw_vsock_vmci_transport vsock vmw_vmci xt_tcpudp nf_nat_ftp nf_nat_sip nf_conntrack_ftp nf_conntrack_sip nf_conntrack_netbios_ns nf_conntrack_broadcast nft_limit nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 n>
Feb 17 02:38:41 Telcontar systemd-journald[25160]: Journal stopped

- --
Cheers
Carlos E. R.
(from openSUSE Leap 15.6 at Legolas)
-----BEGIN PGP SIGNATURE-----
iHoEARECADoWIQQZEb51mJKK1KpcU/W1MxgcbY1H1QUCZ7KZYhwccm9iaW4ubGlz
dGFzQHRlbGVmb25pY2EubmV0AAoJELUzGBxtjUfVXzUAn1T3LD9i4/zMMhbNgS3O
EBQ4yavQAJ9o+5a0uzIoEOZTS4oh46c4ICJOiQ==
=kIsD
-----END PGP SIGNATURE-----

On 2025-02-17 03:05, Carlos E. R. wrote:
I have more data: it has happened several more times. This is Leap 15.6. If I don't hibernate the machine, it does not happen. After hibernation, it happens within a day or two, at about 00:39 hours:

Feb 17 00:38:35 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [kworker/u64:4:19295]
Feb 24 00:40:57 Telcontar kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u64:0:30616]
Mar 03 00:39:57 Telcontar kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/u64:2:1587]
Mar 10 00:39:30 Telcontar kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [kworker/u64:0:16139]

There is no definite systemd timer or cron job that runs at that time (at least, none that leaves a trace in the log). There are a bunch of them running after midnight, of course, and some that run periodically and also shortly before the event (and even after it: the machine is not completely locked, just very busy).

Before that, dovecot complains. I have marked the one that is almost at the same time as the kernel complaint:

<2.4> 2025-03-10T00:33:01.598589+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 0.832904 seconds - adjusting timeouts.
<2.4> 2025-03-10T00:34:22.038132+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 0.703951 seconds - adjusting timeouts.
<2.4> 2025-03-10T00:35:28.766473+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 2.484668 seconds - adjusting timeouts.
<2.4> 2025-03-10T00:36:27.573159+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 1.210396 seconds - adjusting timeouts.
<2.4> 2025-03-10T00:37:20.916857+01:00 Telcontar dovecot - - - imap(15679): Warning: Time jumped forwards 15.455737 seconds <====
<2.4> 2025-03-10T00:39:03.928147+01:00 Telcontar dovecot - - - imap(15679): Warning: Time jumped forwards 14.802280 seconds

The keyboard is locked. The machine responds, albeit very slowly, over ssh from another machine; for instance, I can run top:

top - 03:00:11 up 6 days, 4:27, 5 users, load average: 25.40, 28.60, 33.03
Tasks: 776 total, 8 running, 765 sleeping, 1 stopped, 2 zombie
%Cpu(s): 1.2 us, 20.3 sy, 0.0 ni, 66.5 id, 12.1 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 64240.36+total, 12099.13+free, 22160.45+used, 30989.12+buff/cache
MiB Swap: 102400.0+total, 88699.17+free, 13700.82+used. 42079.91+avail Mem

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
20554 root 20 0 0 0 0 R 100.0 0.000 18:02.81 kworker/10:1+inode_switch_wbs
31821 root 20 0 0 0 0 R 100.0 0.000 52:02.71 kworker/4:2+inode_switch_wbs
19183 root 20 0 0 0 0 R 99.67 0.000 37:30.36 kworker/u64:6+flush-259:0
12690 cer 20 0 3490888 405728 80500 S 4.305 0.617 329:34.53 Isolated Web Co
12761 cer 20 0 3630956 442540 84676 S 1.987 0.673 186:45.13 Isolated Web Co
...

Over ssh I can power off and thus recover logs and files.

<https://bugzilla.opensuse.org/show_bug.cgi?id=1237776>

--
Cheers / Saludos,
Carlos E. R.
(from 15.6 x86_64 at Telcontar)
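As a rough cross-check of the "about 00:39" pattern, the four soft-lockup lines quoted above can be parsed to show the weekday and the gap between incidents. This is only a sketch: syslog stamps carry no year, so 2025 is assumed from the dates of this thread, and the helper names are made up for illustration. Under that assumption the four incidents fall on the same weekday, seven days apart.

```python
from datetime import datetime

# Soft-lockup lines copied from the message above.
LOCKUP_LINES = [
    "Feb 17 00:38:35 Telcontar kernel: watchdog: BUG: soft lockup - CPU#6 stuck for 21s! [kworker/u64:4:19295]",
    "Feb 24 00:40:57 Telcontar kernel: watchdog: BUG: soft lockup - CPU#1 stuck for 22s! [kworker/u64:0:30616]",
    "Mar 03 00:39:57 Telcontar kernel: watchdog: BUG: soft lockup - CPU#9 stuck for 22s! [kworker/u64:2:1587]",
    "Mar 10 00:39:30 Telcontar kernel: watchdog: BUG: soft lockup - CPU#11 stuck for 22s! [kworker/u64:0:16139]",
]

# Assumption: the syslog stamps omit the year; 2025 is taken from the thread.
ASSUMED_YEAR = 2025


def parse_timestamp(line, year=ASSUMED_YEAR):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of one syslog line."""
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def summarize(lines):
    """Return (weekday name, days since previous incident) for each line."""
    rows = []
    previous = None
    for line in lines:
        current = parse_timestamp(line)
        gap = None if previous is None else (current.date() - previous.date()).days
        rows.append((current.strftime("%A"), gap))
        previous = current
    return rows


if __name__ == "__main__":
    for line, (weekday, gap) in zip(LOCKUP_LINES, summarize(LOCKUP_LINES)):
        print(line[:15], weekday, "-" if gap is None else f"+{gap} days")
```

If the weekly spacing holds, weekly timers or cron jobs might be worth checking in addition to the daily ones, though nothing in the logs above confirms a specific job.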

On 3/10/25 2:27 PM, Carlos E. R. wrote:
Was this the daylight saving time change? Linux will handle time falling back okay, but time going forward used to make the kernel freak out (technical term). I know the smart folks know how to make the time change work for spring, but slewing 15 seconds at a time doesn't seem like the way to go.

The "kernel: RIP: 0010:amdgpu_job_timedout+0x1f8/0x230 [amdgpu]" seems to be what gets the ball rolling, so to speak. I'm no AMD expert, but somebody should be able to decipher the hex string that follows that entry:

kernel: Code: 8b 7b 88 48 8d 55 80 4c 89 e6 e8 d3 9d db ff 85 c0 0f 84 3e ff ff ff 89 c6 48 c7 c7 18 47 df c1 e8 8d e8 b8 fa e9 2b ff ff ff <0f> 0b e9 f2 fe ff ff e8 4c 40 f3 fa 49 8b 44 24 18 48 c7 c6 10 b2

--
David C. Rankin, J.D.,P.E.
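On the hex string: in an oops "Code:" line the kernel brackets the byte at the reported instruction pointer with <>. A minimal sketch (the function names are hypothetical) that splits the dump at that marker shows the bytes at RIP are 0f 0b, which is the x86 UD2 opcode that x86 kernels use as the trap behind WARN and BUG. That fits the "WARNING: ... amdgpu_job_timedout+0x1f8/0x230" line earlier in the log, so the dump most likely marks where the warning fired rather than explaining why the SDMA ring timed out.

```python
# The "Code:" bytes quoted above, copied verbatim from the message.
CODE_LINE = (
    "8b 7b 88 48 8d 55 80 4c 89 e6 e8 d3 9d db ff 85 c0 0f 84 3e ff ff ff "
    "89 c6 48 c7 c7 18 47 df c1 e8 8d e8 b8 fa e9 2b ff ff ff <0f> 0b e9 "
    "f2 fe ff ff e8 4c 40 f3 fa 49 8b 44 24 18 48 c7 c6 10 b2"
)


def split_at_rip(code_line):
    """Split a 'Code:' dump into (bytes before RIP, bytes from RIP onward).

    The byte wrapped in <> is the one at the reported instruction pointer.
    """
    before, at_rip = [], []
    seen_marker = False
    for token in code_line.split():
        if token.startswith("<") and token.endswith(">"):
            seen_marker = True
            token = token[1:-1]
        (at_rip if seen_marker else before).append(int(token, 16))
    if not seen_marker:
        raise ValueError("no <..> marker found in Code: line")
    return bytes(before), bytes(at_rip)


def is_ud2(at_rip):
    """True if the bytes at RIP start with 0f 0b, the x86 UD2 opcode."""
    return at_rip[:2] == b"\x0f\x0b"


if __name__ == "__main__":
    before, at_rip = split_at_rip(CODE_LINE)
    print(f"{len(before)} bytes before RIP, {len(at_rip)} from RIP onward")
    print("bytes at RIP:", at_rip[:2].hex(" "), "(UD2)" if is_ud2(at_rip) else "")
```

For a full disassembly, the kernel source tree ships scripts/decodecode, which takes the oops text on standard input.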

On 2025-03-11 06:21, David C. Rankin wrote:
Nope. That change is due to happen on the 30th. Anyway, on other occasions I have deliberately been using the computer during the time adjustment and noticed nothing. IMAP is the only process that notices these time jumps, and there are no other jumps in years of logs, except when the machine is hibernated. This is a kernel BUG: the machine is unusable and has to be restarted to recover.
Linux will handle time fall back okay, but time going forward used to make the kernel freak out (technical term).
That time jump would be 3600 seconds. This time jump is the CPU being stuck for a number of seconds. The IMAP process is held up when it should be processing, and when it connects again, the time has changed.
No, this is a case of a CPU core being completely stuck. It is a kernel BUG, as stated in the log. The machine does not respond to the keyboard at all, the Num Lock key doesn't toggle its LED, top (via ssh) shows the CPU stuck at 100% busy, and it takes minutes to respond to any command. The machine stays in that state for hours, until I come by and notice.

--
Cheers / Saludos,
Carlos E. R.
(from 15.6 x86_64 at Telcontar)
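To put numbers on the 3600-second argument, the dovecot warnings quoted earlier in the thread can be parsed and compared with a one-hour shift. This is a sketch only; the regular expression is written for the two message wordings that appear in those log lines, not for dovecot output in general.

```python
import re

# Dovecot warnings copied from the earlier message in this thread.
DOVECOT_LINES = [
    "<2.4> 2025-03-10T00:33:01.598589+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 0.832904 seconds - adjusting timeouts.",
    "<2.4> 2025-03-10T00:34:22.038132+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 0.703951 seconds - adjusting timeouts.",
    "<2.4> 2025-03-10T00:35:28.766473+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 2.484668 seconds - adjusting timeouts.",
    "<2.4> 2025-03-10T00:36:27.573159+01:00 Telcontar dovecot - - - master: Warning: Time moved forwards by 1.210396 seconds - adjusting timeouts.",
    "<2.4> 2025-03-10T00:37:20.916857+01:00 Telcontar dovecot - - - imap(15679): Warning: Time jumped forwards 15.455737 seconds",
    "<2.4> 2025-03-10T00:39:03.928147+01:00 Telcontar dovecot - - - imap(15679): Warning: Time jumped forwards 14.802280 seconds",
]

# A daylight saving change moves the wall clock by one hour.
DST_SHIFT_SECONDS = 3600.0

# Matches both wordings seen above: "moved forwards by N" and "jumped forwards N".
JUMP_RE = re.compile(r"Time (?:moved|jumped) forwards(?: by)? ([0-9.]+) seconds")


def forward_jumps(lines):
    """Return the forward time jumps, in seconds, reported by each line."""
    jumps = []
    for line in lines:
        match = JUMP_RE.search(line)
        if match:
            jumps.append(float(match.group(1)))
    return jumps


if __name__ == "__main__":
    jumps = forward_jumps(DOVECOT_LINES)
    print(f"largest jump: {max(jumps):.3f} s, total: {sum(jumps):.3f} s")
    print(f"largest jump as a fraction of a DST shift: {max(jumps) / DST_SHIFT_SECONDS:.4f}")
```

The largest reported jump is about 15 seconds, the same order as the 21 to 24 seconds the watchdog reports a CPU as stuck, and nowhere near an hour.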

participants (2)
- Carlos E. R.
- David C. Rankin