Bug ID | 994969 |
---|---|
Summary | NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [irq/32-opal-elo:397] |
Classification | openSUSE |
Product | openSUSE Tumbleweed |
Version | Current |
Hardware | PowerPC-64 |
OS | Other |
Status | NEW |
Severity | Normal |
Priority | P5 - None |
Component | Kernel |
Assignee | kernel-maintainers@forge.provo.novell.com |
Reporter | ro@suse.com |
QA Contact | qa-bugs@suse.de |
Found By | --- |
Blocker | --- |
Since about kernel 4.0 I have not been able to boot a POWER7 machine with OPAL/bare-metal. The hang usually starts after SCSI controller/disk initialisation:

ipr: 00000160: 00000000 18000040 0000031F 00060000
ipr: 00000170: 220218D1 0000C49B 0000CECE 00000000
scsi 0:255:255:255: No Device IBM 2B4C001SISIOA 0150 PQ: 0 ANSI: 0
scsi 0:0:5:0: Direct-Access IBM ST9300653SS 740D PQ: 0 ANSI: 6
scsi 0:0:6:0: Direct-Access IBM ST9300653SS 740D PQ: 0 ANSI: 6
scsi 0:0:7:0: Direct-Access IBM ST9300653SS 740D PQ: 0 ANSI: 6
scsi 0:0:8:0: Direct-Access IBM ST9300653SS 740D PQ: 0 ANSI: 6
scsi 0:0:9:0: Direct-Access IBM ST9300653SS 740D PQ: 0 ANSI: 6
scsi 0:255:0:0: Direct-Access IBM IPR-0 112D8FAF PQ: 0 ANSI: 3
NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! [irq/32-opal-elo:397]
Modules linked in: ipr(+) libata tg3 ptp pps_core libphy scsi_mod agpgart drbg ansi_cprng
CPU: 8 PID: 397 Comm: irq/32-opal-elo Not tainted 4.7.0-2-default #1
task: c000000f2cda3780 ti: c000000f2de30000 task.ti: c000000f2de30000
NIP: c0000000000104c4 LR: c0000000000104c4 CTR: 0000000030041c0c
REGS: c000000f2de33830 TRAP: 0901 Not tainted (4.7.0-2-default)
MSR: 900000000280b032 <SF,HV,VEC,VSX,EE,FP,ME,IR,DR,RI> CR: 42000844 XER: 00000000
CFAR: c00000000023c480 SOFTE: 1
GPR00: c000000000010488 c000000f2de33ab0 c000000001320c00 0000000000000900
GPR04: c000000001178368 0000000000000008 900000000280b032 0000000000000000
GPR08: 0000000000000003 0000000000000000 0000000000000000 0000000000000000
GPR12: c00000000008728c c00000000fb84800
NIP [c0000000000104c4] .arch_local_irq_restore+0x74/0x90
LR [c0000000000104c4] .arch_local_irq_restore+0x74/0x90
Call Trace:
[c000000f2de33ab0] [c000000000010488] .arch_local_irq_restore+0x38/0x90 (unreliable)
[c000000f2de33b20] [c000000000166c04] .irq_finalize_oneshot.part.2+0xb4/0x250
[c000000f2de33bc0] [c000000000166edc] .irq_thread_fn+0x7c/0xa0
[c000000f2de33c50] [c000000000167390] .irq_thread+0x1b0/0x270
[c000000f2de33d30] [c00000000011ba4c] .kthread+0x10c/0x130
[c000000f2de33e30] [c00000000000966c] .ret_from_kernel_thread+0x58/0x6c
Instruction dump:
409e002c e92d0020 61298000 7d210164 38210070 e8010010 7c0803a6 4e800020
60000000 60000000 60000000 4bff1e0d <60000000> 4bffffdc 60000000 e92d0020
BUG: workqueue lockup - pool cpus=8 node=0 flags=0x0 nice=0 stuck for 51s!
Showing busy workqueues and worker pools:
workqueue events: flags=0x0
pwq 16: cpus=8 node=0 flags=0x0 nice=0 active=3/256
pending: .cache_reap, .push_to_pool, .push_to_pool
pwq 0: cpus=0 node=0 flags=0x0 nice=0 active=1/256
in-flight: 509:.ipr_worker_thread [ipr]
workqueue vmstat: flags=0xc
pwq 16: cpus=8 node=0 flags=0x0 nice=0 active=1/256
pending: .vmstat_update
pool 0: cpus=0 node=0 flags=0x0 nice=0 hung=1s workers=3 idle: 684 4
systemd-udevd[650]: seq 829 '/devices/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:06.0/0000:60:00.0' is taking a long time
systemd-udevd[650]: seq 824 '/devices/pci0000:00/0000:00:00.0/0000:01:00.0/0000:02:05.0/0000:40:00.0' is taking a long time
IPv6: ADDRCONF(NETDEV_UP): enP3p1s0f0: link is not ready
INFO: rcu_sched self-detected stall on CPU
8-...: (6000 ticks this GP) idle=36d/140000000000001/0 softirq=162/162 fqs=4053
(t=6000 jiffies g=24 c=23 q=41364)
Task dump for CPU 8:
irq/32-opal-elo R running task 0 397 2 0x00000804
Call Trace:
[c000000f2de330d0] [c00000000012f7d8] .sched_show_task+0xd8/0x180 (unreliable)
[c000000f2de33150] [c000000000937d50] .rcu_dump_cpu_stacks+0xbc/0x104
[c000000f2de331f0] [c00000000017976c] .rcu_check_callbacks+0xa3c/0xaf0
[c000000f2de33320] [c000000000180c30] .update_process_times+0x50/0xa0
[c000000f2de333a0] [c000000000197c90] .tick_sched_handle.isra.6+0x40/0xd0
[c000000f2de33430] [c000000000197d84] .tick_sched_timer+0x64/0xd0
[c000000f2de334d0] [c000000000181798] .__hrtimer_run_queues+0x128/0x430
[c000000f2de335b0] [c0000000001826f8] .hrtimer_interrupt+0xf8/0x330
[c000000f2de336a0] [c00000000001f034] .__timer_interrupt+0x94/0x270
[c000000f2de33740] [c00000000001f3a0] .timer_interrupt+0x90/0xd0
[c000000f2de337c0] [c000000000002824] decrementer_common+0x124/0x180
--- interrupt: 901 at .arch_local_irq_restore+0x74/0x90
LR = .arch_local_irq_restore+0x74/0x90
[c000000f2de33ab0] [c000000000010488] .arch_local_irq_restore+0x38/0x90 (unreliable)
[c000000f2de33b20] [c000000000166c04] .irq_finalize_oneshot.part.2+0xb4/0x250
[c000000f2de33bc0] [c000000000166edc] .irq_thread_fn+0x7c/0xa0
[c000000f2de33c50] [c000000000167390] .irq_thread+0x1b0/0x270
[c000000f2de33d30] [c00000000011ba4c] .kthread+0x10c/0x130
[c000000f2de33e30] [c00000000000966c] .ret_from_kernel_thread+0x58/0x6c
tg3 0003:01:00.0 enP3p1s0f0: Link is up at 1000 Mbps, full duplex
tg3 0003:01:00.0 enP3p1s0f0: Flow control is off for TX and off for RX
tg3 0003:01:00.0 enP3p1s0f0: EEE is enabled
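
For comparing boot logs across kernel versions, it may help to extract the stuck CPU, duration, and task from each soft-lockup line. The following is a minimal sketch, assuming the console output has been saved as text; the function name `find_soft_lockups` and the regular expression are illustrative and only cover the message format shown in this report.

```python
import re

# Matches the watchdog line as it appears in this report, e.g.
# "BUG: soft lockup - CPU#8 stuck for 22s! [irq/32-opal-elo:397]".
# The format is taken from the log above; other kernels may differ.
SOFT_LOCKUP = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^\]]+?):(?P<pid>\d+)\]"
)


def find_soft_lockups(log_text):
    """Return one dict per soft-lockup message found in log_text."""
    return [
        {
            "cpu": int(m.group("cpu")),
            "seconds": int(m.group("secs")),
            "comm": m.group("comm"),
            "pid": int(m.group("pid")),
        }
        for m in SOFT_LOCKUP.finditer(log_text)
    ]


if __name__ == "__main__":
    sample = (
        "NMI watchdog: BUG: soft lockup - CPU#8 stuck for 22s! "
        "[irq/32-opal-elo:397]"
    )
    for hit in find_soft_lockups(sample):
        print(hit)
```

The task name is matched non-greedily up to the last colon before the closing bracket, so names that themselves contain a colon (such as kworker threads) should still be split from the PID correctly.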