On 06/07/2018 12:40 AM, Per Jessen wrote:
perf: interrupt took too long (10078 > 2500), lowering kernel.perf_event_max_sample_rate to 19800 I see that all the time ? When I googled it very recently, I think the consensus was not to worry.
Well, generally I don't, and LKML doesn't seem to be concerned with it, the only problem for me is the set of dumps and checks adds about 2 min.+ to the init of the box as looping though all cores sending NMI to every other core. Like I said, except for this one quirky board, (and even with this one after the delay) -- no problems whatsoever otherwise. Strange init issue though -- it showed up in one of the early 4.16.x releases, e.g. (1st 2 lines for context -- all OK to there) [ 0.464392] NET: Registered protocol family 1 [ 0.464769] PCI Interrupt Link [LUB0] enabled at IRQ 23 [ 10.412957] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1.718 msecs [ 10.412957] perf: interrupt took too long (10078 > 2500), lowering kernel.perf_event_max_sample_rate to 19800 [ 10.466267] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 3.398 msecs [ 10.466267] perf: interrupt took too long (23235 > 12597), lowering kernel.perf_event_max_sample_rate to 8400 [ 11.139636] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 38.086 msecs [ 11.138340] perf: interrupt took too long (979330 > 911352), lowering kernel.perf_event_max_sample_rate to 300 [ 11.391081] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 72.692 msecs [ 11.139636] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 381.378 msecs [ 12.138796] perf: interrupt took too long (1897735 > 1224162), lowering kernel.perf_event_max_sample_rate to 300 [ 25.142864] INFO: NMI handler (perf_event_nmi_handler) took too long to run: 1947.833 msecs [ 28.140139] perf: interrupt took too long (13755889 > 2372168), lowering kernel.perf_event_max_sample_rate to 300 [ 28.138850] watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [swapper/0:0] [ 28.138850] Modules linked in: [ 28.138850] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.16.13-1-ARCH #1 [ 86.138261] Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014 11/06/2009 [ 90.139366] RIP: 0010:__do_softirq+0x75/0x2e0 [ 90.139366] RSP: 0018:ffff9b36e7c03f70 EFLAGS: 00000297 ORIG_RAX: ffffffffffffff12 [ 90.139366] RAX: 0000000000021000 RBX: ffff9b36e7c15f00 RCX: 000000003543891f [ 91.138917] INFO: NMI handler (nmi_cpu_backtrace_handler) took too long to run: 467.840 msecs [ 91.138917] RDX: 0000000000000000 RSI: 000000003543891f RDI: 0000000000000100 [ 91.138917] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000078 [ 91.138917] R10: 000000005b1397ce R11: 0000000000000000 R12: 0000000000000000 [ 91.138917] R13: 0000000000000000 R14: 0000000000000282 R15: 0000000000000000 [ 91.138917] FS: 0000000000000000(0000) GS:ffff9b36e7c00000(0000) knlGS:0000000000000000 [ 91.138917] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 91.138917] NMI watchdog: Watchdog detected hard LOCKUP on cpu 4 [ 92.138146] CR2: ffff9b33d2655000 CR3: 000000011200a000 CR4: 00000000000006f0 [ 92.138146] Call Trace: [ 92.138146] <IRQ> [ 92.138146] irq_exit+0xc9/0xe0 [ 93.138944] smp_apic_timer_interrupt+0x73/0x160 [ 94.137773] Modules linked in: [ 94.137772] apic_timer_interrupt+0xf/0x20 [ 95.137578] </IRQ> [ 95.137578] RIP: 0010:native_safe_halt+0x2/0x10 [ 95.137578] RSP: 0018:ffffffffbf003ea0 EFLAGS: 00000202 [ 96.136670] CPU: 4 PID: 0 Comm: swapper/4 Not tainted 4.16.13-1-ARCH #1 [ 97.138510] Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014 11/06/2009 [ 97.138081] ORIG_RAX: ffffffffffffff12 <snip another 900 lines> [ 151.438482] Sending NMI from CPU 1 to CPUs 3: [ 151.469721] NMI backtrace for cpu 3 [ 151.469722] CPU: 3 PID: 0 Comm: swapper/3 Tainted: G L 4.16.13-1-ARCH #1 [ 151.469723] Hardware name: Supermicro H8DM8-2/H8DM8-2, BIOS 080014 11/06/2009 [ 151.469724] RIP: 0010:native_queued_spin_lock_slowpath+0x19a/0x1c0 [ 151.469725] RSP: 0018:ffff9b36e7cc3ef8 EFLAGS: 00000046 [ 151.469726] RAX: 0000000000000000 RBX: ffffffffbf051640 RCX: ffff9b36e7ce2480 [ 151.469726] RDX: 0000000000000002 RSI: ffffffffbee70e1c RDI: 0000000000100000 [ 151.469727] RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000101 [ 151.469728] R10: 0000000000338518 R11: 0000000000000000 R12: ffff9b36e7ce2480 [ 151.469728] R13: ffffffffbeed81f2 R14: 0000000000000009 R15: 0000000000000000 [ 151.469729] FS: 0000000000000000(0000) GS:ffff9b36e7cc0000(0000) knlGS:0000000000000000 [ 151.469730] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 151.469731] CR2: 0000000000000000 CR3: 000000011200a000 CR4: 00000000000006e0 [ 151.469731] Call Trace: [ 151.469732] <IRQ> [ 151.469733] _raw_spin_lock_irqsave+0x44/0x50 [ 151.469735] rcu_process_callbacks+0xce/0x3b0 [ 151.469737] __do_softirq+0xf1/0x2e0 [ 151.469739] irq_exit+0xc9/0xe0 [ 151.469741] smp_apic_timer_interrupt+0x73/0x160 [ 151.469743] apic_timer_interrupt+0xf/0x20 [ 151.469744] </IRQ> [ 151.469745] RIP: 0010:native_safe_halt+0x2/0x10 [ 151.469745] RSP: 0018:ffff9d6a431bbec0 EFLAGS: 00000202 ORIG_RAX: ffffffffffffff12 [ 151.469747] RAX: 0000000000000003 RBX: 0000000000000003 RCX: 0000000000000000 [ 151.469747] RDX: 0000000000000000 RSI: ffffffffbee70e1c RDI: ffffffffbee7107a [ 151.469748] RBP: ffffffffbf132b60 R08: 0000000000000000 R09: 0000000000000000 [ 151.469748] R10: 0000000000000000 R11: 0000000000000000 R12: ffffffffbeed81f2 [ 151.469749] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 151.469752] default_idle+0x18/0x100 [ 151.469753] do_idle+0x179/0x1b0 [ 151.469755] cpu_startup_entry+0x6f/0x80 [ 151.469757] start_secondary+0x1aa/0x200 [ 151.469758] secondary_startup_64+0xa5/0xb0 [ 151.469760] Code: 12 83 e0 03 83 ea 01 48 c1 e0 04 48 63 d2 48 05 80 24 02 00 48 03 04 d5 80 04 ef be 48 89 08 8b 41 08 85 c0 75 09 f3 90 8b 41 08 <85> c0 74 f7 48 8b 01 48 85 c0 0f 84 54 ff ff ff 49 89 c0 0f 0d [ 151.438482] Sending NMI from CPU 1 to CPUs 4: [ 151.470739] NMI backtrace for cpu 4 skipped: idling at native_safe_halt+0x2/0x10 [ 151.438482] Sending NMI from CPU 1 to CPUs 5: [ 151.470007] NMI backtrace for cpu 5 skipped: idling at native_safe_halt+0x2/0x10 [ 151.438482] Sending NMI from CPU 1 to CPUs 7: [ 151.471011] NMI backtrace for cpu 7 skipped: idling at native_safe_halt+0x2/0x10 [ 151.474216] PCI Interrupt Link [LUB2] enabled at IRQ 22 [ 151.490261] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490302] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490342] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490383] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490428] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490477] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490527] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490581] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490636] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490694] pci 0000:00:00.0: Found enabled HT MSI Mapping [ 151.490723] pci 0000:01:05.0: Video device with shadowed ROM at [mem 0x000c0000-0x000dffff] [ 151.490725] PCI: CLS 64 bytes, default 64 [ 151.490783] Unpacking initramfs... [ 151.620261] Freeing initrd memory: 7660K [ 151.620268] PCI-DMA: Disabling AGP. All good from here on out... -- David C. Rankin, J.D.,P.E. -- To unsubscribe, e-mail: opensuse+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse+owner@opensuse.org