On Fri 2018-04-06 20:07:12, Stefan Priebe - Profihost AG wrote:
Hello,
Under memory pressure on a hypervisor running ksmd, I had two deadlocks today where the machines rebooted due to lockups.
The kernel was built from commit: 4fe1fb26557d69a4d0397113eb89d2dd7d6021b4
2018-04-05 18:15:45 ksmtuned D ffff883e8d0afd08 0 2259 1 0x00080000
2018-04-05 18:15:45 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-04-05 18:15:45 Not tainted 4.4.126+557-ph #1
2018-04-05 18:15:45 INFO: task ksmtuned:2259 blocked for more than 120 seconds.
2018-04-05 18:15:45 [<ffffffff810a49f0>] ? kthread_park+0x60/0x60
2018-04-05 18:15:45 Leftover inexact backtrace:
2018-04-05 18:15:45
2018-04-05 18:15:45 DWARF2 unwinder stuck at ret_from_fork+0x55/0x80
2018-04-05 18:15:45 [<ffffffff816e60c5>] ret_from_fork+0x55/0x80
2018-04-05 18:15:45 [<ffffffff810a4add>] kthread+0xed/0x110
2018-04-05 18:15:45 [<ffffffff811b32c5>] ksm_scan_thread+0x85/0x1b0
2018-04-05 18:15:45 [<ffffffff811b2adb>] ksm_do_scan+0x69b/0xe00
2018-04-05 18:15:45 [<ffffffff8116b18d>] lru_add_drain_all+0x13d/0x190
2018-04-05 18:15:45 [<ffffffff8109ecae>] flush_work+0xfe/0x170
2018-04-05 18:15:45 [<ffffffff816e2998>] wait_for_completion+0xa8/0x110
2018-04-05 18:15:45 [<ffffffff816e4a4a>] schedule_timeout+0x23a/0x2d0
2018-04-05 18:15:45 [<ffffffff816e15f5>] schedule+0x35/0x80
2018-04-05 18:15:45 Call Trace:
2018-04-05 18:15:45 0000000000000038 ffff883f7fbb7c20 ffffffff816e15f5 7fffffffffffffff
2018-04-05 18:15:45 ffff883f7fbb8000 ffff883f7fbb7d58 ffff883f7fbb7d50 ffff883f7fbfa580
2018-04-05 18:15:45 ffff883f7fbb7c08 ffff887f7dad81c0 ffff883f00000031 ffff883f7fbfa580
2018-04-05 18:15:45 ksmd D ffff883f7fbb7c08 0 409 2 0x00080000
2018-04-05 18:15:45 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-04-05 18:15:45 Not tainted 4.4.126+557-ph #1
2018-04-05 18:15:45 INFO: task ksmd:409 blocked for more than 120 seconds.
2018-04-05 18:17:45 Leftover inexact backtrace:
2018-04-05 18:17:45
2018-04-05 18:17:45 DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xc9
2018-04-05 18:17:45 [<ffffffff816e5c85>] entry_SYSCALL_64_fastpath+0x1e/0xc9
2018-04-05 18:17:45 [<ffffffff811d71c6>] SyS_write+0x46/0xa0
2018-04-05 18:17:45 [<ffffffff811d60a9>] vfs_write+0xa9/0x190
2018-04-05 18:17:45 [<ffffffff811d548b>] __vfs_write+0x2b/0x130
2018-04-05 18:17:45 [<ffffffff81250903>] kernfs_fop_write+0x143/0x180
2018-04-05 18:17:45 [<ffffffff81250d5c>] sysfs_kf_write+0x3c/0x50
2018-04-05 18:17:45 [<ffffffff813d44d2>] kobj_attr_store+0x12/0x20
2018-04-05 18:17:45 [<ffffffff811b21c8>] run_store+0x48/0x2c0
2018-04-05 18:17:45 [<ffffffff816e33b7>] mutex_lock+0x17/0x30
2018-04-05 18:17:45 [<ffffffff816e3325>] __mutex_lock_slowpath+0x95/0x110
2018-04-05 18:17:45 [<ffffffff816e18fe>] schedule_preempt_disabled+0xe/0x10
2018-04-05 18:17:45 [<ffffffff816e15f5>] schedule+0x35/0x80
BTW: I am surprised that the backtrace is in reverse order. It is pretty confusing. Has anyone seen this before?

I also wonder what the order of the other messages is. I mean, is this backtrace related to the line above:

2018-04-05 18:15:45 INFO: task ksmd:409 blocked for more than 120 seconds.

or to the line below:

2018-04-05 18:17:45 ksmtuned D ffff883e8d0afd08 0 2259 1 0x00080000

Best Regards,
Petr
2018-04-05 18:17:45 Call Trace:
2018-04-05 18:17:45 ffffffff81e8d588 ffff883e8d0afd20 ffffffff816e15f5 ffffffff81e8d580
2018-04-05 18:17:45 ffff883e8d0b0000 ffffffff81e8d584 ffff887f7a228000 00000000ffffffff
2018-04-05 18:17:45 ffff883e8d0afd08 ffffffff811d3ca2 ffff883e0000001e ffff887f7a228000
2018-04-05 18:17:45 ksmtuned D ffff883e8d0afd08 0 2259 1 0x00080000
2018-04-05 18:17:45 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-04-05 18:17:45 Not tainted 4.4.126+557-ph #1
2018-04-05 18:17:45 INFO: task ksmtuned:2259 blocked for more than 120 seconds.
2018-04-05 18:17:45 [<ffffffff810a49f0>] ? kthread_park+0x60/0x60
2018-04-05 18:17:45 Leftover inexact backtrace:
2018-04-05 18:17:45
2018-04-05 18:17:45 DWARF2 unwinder stuck at ret_from_fork+0x55/0x80
2018-04-05 18:17:45 [<ffffffff816e60c5>] ret_from_fork+0x55/0x80
2018-04-05 18:17:45 [<ffffffff810a4add>] kthread+0xed/0x110
2018-04-05 18:17:45 [<ffffffff811b32c5>] ksm_scan_thread+0x85/0x1b0
2018-04-05 18:17:45 [<ffffffff811b2adb>] ksm_do_scan+0x69b/0xe00
2018-04-05 18:17:45 [<ffffffff8116b18d>] lru_add_drain_all+0x13d/0x190
2018-04-05 18:17:45 [<ffffffff8109ecae>] flush_work+0xfe/0x170
2018-04-05 18:17:45 [<ffffffff816e2998>] wait_for_completion+0xa8/0x110
2018-04-05 18:17:45 [<ffffffff816e4a4a>] schedule_timeout+0x23a/0x2d0
2018-04-05 18:17:45 [<ffffffff816e15f5>] schedule+0x35/0x80
2018-04-05 18:17:45 Call Trace:
2018-04-05 18:17:45 0000000000000038 ffff883f7fbb7c20 ffffffff816e15f5 7fffffffffffffff
2018-04-05 18:17:45 ffff883f7fbb8000 ffff883f7fbb7d58 ffff883f7fbb7d50 ffff883f7fbfa580
2018-04-05 18:17:45 ffff883f7fbb7c08 ffff887f7dad81c0 ffff883f00000031 ffff883f7fbfa580
2018-04-05 18:17:45 ksmd D ffff883f7fbb7c08 0 409 2 0x00080000
2018-04-05 18:17:45 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
2018-04-05 18:17:45 INFO: task ksmd:409 blocked for more than 120 seconds.
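
The log excerpts in this mail appear to have been captured newest-first, which is why each backtrace reads upside down. A minimal Python sketch for putting such a capture back into reading order could look like the following. It assumes the capture is strictly newest-first and that every report begins with an "INFO: task ... blocked for more than" line; the helper names chronological and split_reports are made up for illustration, and the sample is an abridged excerpt of the first ksmd report.

```python
def chronological(lines):
    """Return non-empty log lines oldest-first.

    Assumption: the input is strictly newest-first.
    """
    kept = [ln.rstrip("\n") for ln in lines if ln.strip()]
    return kept[::-1]


def split_reports(lines):
    """Group oldest-first lines into one list per hung-task report.

    Assumption: a report starts at an "INFO: task ... blocked for more than"
    line. Lines before the first such line are dropped.
    """
    reports = []
    for ln in lines:
        if "INFO: task " in ln and "blocked for more than" in ln:
            reports.append([])
        if reports:
            reports[-1].append(ln)
    return reports


# Abridged excerpt of the first ksmd report, in the order it was posted.
sample = [
    "2018-04-05 18:15:45 [<ffffffff816e2998>] wait_for_completion+0xa8/0x110",
    "2018-04-05 18:15:45 [<ffffffff816e4a4a>] schedule_timeout+0x23a/0x2d0",
    "2018-04-05 18:15:45 [<ffffffff816e15f5>] schedule+0x35/0x80",
    "2018-04-05 18:15:45 Call Trace:",
    "2018-04-05 18:15:45 ksmd D ffff883f7fbb7c08 0 409 2 0x00080000",
    "2018-04-05 18:15:45 INFO: task ksmd:409 blocked for more than 120 seconds.",
]

for report in split_reports(chronological(sample)):
    print("\n".join(report))
    print()
```

Run on the full capture, this would print each report starting with its INFO line and ending with the outermost stack frame, which should make it easier to see which backtrace belongs to which task.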
Greets,
Stefan
--
To unsubscribe, e-mail: opensuse-kernel+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-kernel+owner@opensuse.org