[Bug 1081431] New: qemu: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:238]
http://bugzilla.opensuse.org/show_bug.cgi?id=1081431

            Bug ID: 1081431
           Summary: qemu: NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:238]
    Classification: openSUSE
           Product: openSUSE Distribution
           Version: Leap 42.3
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: Virtualization:Tools
          Assignee: virt-bugs@suse.de
          Reporter: matwey.kornilov@gmail.com
        QA Contact: qa-bugs@suse.de
          Found By: ---
           Blocker: ---

Hello,

I am running a private openSUSE Build Service instance. After upgrading to Leap 42.3 I see the following breaking issue with qemu-2.9.1. The guest kernel gets stuck at boot with the following message:

[ 28.100134] NMI watchdog: BUG: soft lockup - CPU#0 stuck for 23s! [modprobe:238]

The same setup worked well with Leap 42.2 (qemu-2.6.2).

The qemu command line is the following:

./x86_64-softmmu/qemu-system-x86_64 -enable-kvm -nodefaults -no-reboot -nographic -vga none \
    -object rng-random,filename=/dev/random,id=rng0 -device virtio-rng-pci,rng=rng0 \
    -runas qemu -cpu host -net none -cpu host,-tsc-deadline,pmu=off \
    -kernel /var/cache/obs/worker/root_1/.mount/boot/kernel \
    -initrd /var/cache/obs/worker/root_1/.mount/boot/initrd \
    -append 'root=/dev/disk/by-id/virtio-0 rootfstype=ext4 rootflags=noatime panic=1 quiet no-kvmclock nmi_watchdog=0 rw rd.driver.pre=binfmt_misc elevator=noop console=ttyS0 init=/.build/build' \
    -m 2048 \
    -drive file=/var/cache/obs/worker/root_1/root,format=raw,if=none,id=disk,serial=0,cache=unsafe \
    -device virtio-blk-pci,drive=disk \
    -drive file=/var/cache/obs/worker/root_1/swap,format=raw,if=none,id=swap,serial=1,cache=unsafe \
    -device virtio-blk-pci,drive=swap \
    -serial stdio -smp 2

I've bisected the first bad commit: 9a4c0e220d8a ("hw/virtio-pci: fix virtio behaviour")

-- 
You are receiving this mail because:
You are on the CC list for the bug.

http://bugzilla.opensuse.org/show_bug.cgi?id=1081431#c1

--- Comment #1 from Matwey Kornilov <matwey.kornilov@gmail.com> ---
Appending disable-modern to -device virtio-blk-pci fixes the issue, for an unknown reason.
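
For readers who want to try the workaround from comment #1, the following is a minimal sketch of how the two -device arguments from the original command line might look. It assumes the property is spelled disable-modern=on (the comment only says "disable-modern"), and the variable names blk_opts, disk_dev and swap_dev are purely illustrative. It only prints the arguments and does not start qemu.

```shell
#!/bin/sh
# Sketch of the workaround from comment #1: ask virtio-blk-pci to expose
# only the legacy (pre-1.0) virtio PCI interface.
# Assumption: the property is given as "disable-modern=on".
blk_opts="disable-modern=on"

# Same drive ids ("disk" and "swap") as in the reported command line.
disk_dev="virtio-blk-pci,drive=disk,${blk_opts}"
swap_dev="virtio-blk-pci,drive=swap,${blk_opts}"

# Print the arguments that would replace the two original -device options.
printf '%s\n' "-device ${disk_dev}" "-device ${swap_dev}"
```

The rest of the command line stays unchanged; only the two virtio-blk-pci devices get the extra property.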

http://bugzilla.opensuse.org/show_bug.cgi?id=1081431#c2

--- Comment #2 from Matwey Kornilov <matwey.kornilov@gmail.com> ---
Example for lockup stacktrace:

[ 28.268091] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [modprobe:239]
[ 28.268682] Modules linked in: virtio_blk%2

http://bugzilla.opensuse.org/show_bug.cgi?id=1081431#c3

--- Comment #3 from Matwey Kornilov <matwey.kornilov@gmail.com> ---
Example for lockup trace:

[ 28.268091] NMI watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [modprobe:239]
[ 28.268682] Modules linked in: virtio_blk(+) virtio_mmio virtio_pci virtio_ring virtio nf_conntrack_ipv6 nf_defrag_ipv6 nf_conntrack btrfs xor raid6_pq reiserfs ext4 crc16 jbd2 mbcache squashfs fuse dm_snapshot dm_bufio dm_mod binfmt_misc loop sg scsi_mod autofs4
[ 28.284605] CPU: 1 PID: 239 Comm: modprobe Not tainted 4.4.76-1-default #1
[ 28.284605] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
[ 28.296206] task: ffff88007a2244c0 ti: ffff88007a208000 task.ti: ffff88007a208000
[ 28.300306] RIP: 0010:[<ffffffff813531bc>] [<ffffffff813531bc>] iowrite16+0x2c/0x30
[ 28.304250] RSP: 0018:ffff88007a20b428 EFLAGS: 00010292
[ 28.308256] RAX: ffff88007a1988c0 RBX: ffff880079ef7000 RCX: ffff88007a1b2940
[ 28.312235] RDX: ffffc900003c6000 RSI: ffffc900003c6000 RDI: 0000000000000000
[ 28.316214] RBP: ffff88007a0d5ac0 R08: ffff88007a36b2f8 R09: 0000000001080020
[ 28.320199] R10: 000000000001d550 R11: ffff88007a27d888 R12: ffff88007a20b4a8
[ 28.320199] R13: 0000000000000000 R14: ffff88007a14ec00 R15: 0000000000000282
[ 28.324210] FS: 00007fa9415e1700(0000) GS:ffff88007f700000(0000) knlGS:0000000000000000
[ 28.332211] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 28.332211] CR2: 0000000000b52078 CR3: 000000007a14f000 CR4: 00000000000006e0
[ 28.336219] Stack:
[ 28.340186] ffffffffa01e88e2 ffffffffa0118082 0000000000000000 ffffffffa03786f3
[ 28.344215] ffff88007a36b2f8 0000000100000000 ffff88007a36b180 0000000000000000
[ 28.348195] ffff88007a14ec00 ffff88007a27d888 ffff88007a14ec90 ffff88007a20b4f8
[ 28.352207] Call Trace:
[ 28.352207] [<ffffffffa01e88e2>] vp_notify+0x12/0x20 [virtio_pci]
[ 28.356200] [<ffffffffa0118082>] virtqueue_notify+0x12/0x30 [virtio_ring]
[ 28.360200] [<ffffffffa03786f3>] virtio_queue_rq+0x233/0x270 [virtio_blk]
[ 28.364237] [<ffffffff81313d8f>] blk_mq_dispatch_rq_list+0xcf/0x1e0
[ 28.368218] [<ffffffff81313fc3>] blk_mq_process_rq_list+0x123/0x140
[ 28.372185] [<ffffffff8131407a>] __blk_mq_run_hw_queue+0x9a/0xb0
[ 28.376179] [<ffffffff81314169>] __blk_mq_delay_run_hw_queue+0xd9/0xf0
[ 28.380203] [<ffffffff81315117>] blk_sq_make_request+0x367/0x4a0
[ 28.384212] [<ffffffff81308048>] generic_make_request+0xf8/0x2b0
[ 28.388215] [<ffffffff8130826e>] submit_bio+0x6e/0x140
[ 28.388215] [<ffffffff8124677d>] submit_bh_wbc+0x12d/0x160
[ 28.392218] [<ffffffff81246a9d>] block_read_full_page+0x1dd/0x310
[ 28.396200] [<ffffffff81196738>] do_read_cache_page+0x108/0x1b0
[ 28.400219] [<ffffffff8131c8d9>] read_dev_sector+0x79/0x90
[ 28.404174] [<ffffffff81322d2a>] read_lba+0xda/0x170
[ 28.404174] [<ffffffff8132332a>] find_valid_gpt+0xfa/0x710
[ 28.408210] [<ffffffff813239c9>] efi_partition+0x89/0x440
[ 28.412213] [<ffffffff8131e8b8>] check_partition+0xf8/0x1e0
[ 28.416211] [<ffffffff8131d196>] rescan_partitions+0xb6/0x380
[ 28.416211] [<ffffffff8124b1c3>] __blkdev_get+0x373/0x4b0
[ 28.420178] [<ffffffff8124b4d0>] blkdev_get+0x1d0/0x320
[ 28.424216] [<ffffffff8131a92c>] device_add_disk+0x38c/0x4a0
[ 28.428209] [<ffffffffa03792f5>] virtblk_probe+0x465/0x7ad [virtio_blk]
[ 28.432211] [<ffffffffa00de698>] virtio_dev_probe+0x138/0x210 [virtio]
[ 28.436223] [<ffffffff814884b7>] driver_probe_device+0x1f7/0x420
[ 28.436223] [<ffffffff8148875b>] __driver_attach+0x7b/0x80
[ 28.440208] [<ffffffff814863a8>] bus_for_each_dev+0x58/0x90
[ 28.444204] [<ffffffff814878f9>] bus_add_driver+0x1c9/0x280
[ 28.448217] [<ffffffff8148911b>] driver_register+0x5b/0xd0
[ 28.452215] [<ffffffffa037e04a>] init+0x4a/0x1000 [virtio_blk]
[ 28.456214] [<ffffffff81002138>] do_one_initcall+0xc8/0x1f0
[ 28.456214] [<ffffffff8119430d>] do_init_module+0x5a/0x1d7
[ 28.460206] [<ffffffff811136a7>] load_module+0x1e97/0x2800
[ 28.464210] [<ffffffff811141e0>] SYSC_finit_module+0x70/0xa0
[ 28.468217] [<ffffffff816303b2>] entry_SYSCALL_64_fastpath+0x16/0x71
[ 28.472213] DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x16/0x71
[ 28.476236]
[ 28.476236] Leftover inexact backtrace:
[ 28.476236]
[ 28.480239] Code: 81 fe ff ff 03 00 48 89 f2 77 20 48 81 fe 00 00 01 00 76 08 0f b7 d6 89 f8 66 ef c3 48 c7 c6 5b d3 a4 81 48 89 d7 e9 b4 fe ff ff <66> 89 3e c3 48 81 fe ff ff 03 00 48 89 f2 77 1f 48 81 fe 00 00
[ 28.496194] Kernel panic - not syncing: softlockup: hung tasks
[ 28.496194] CPU: 1 PID: 239 Comm: modprobe Tainted: G L 4.4.76-1-default #1
[ 28.504212] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.9.3-0-ge2fc41e-prebuilt.qemu-project.org 04/01/2014
[ 28.508181] 0000000000000000 ffffffff81339d97 ffffffff81a2727f ffff88007f703ed0
[ 28.512196] ffffffff811933d0 0000000000000008 ffff88007f703ee0 ffff88007f703e80
[ 28.516186] ffffffff82237e60 0000000000000000 0000000000000000 0000000000000006
[ 28.524205] Call Trace:
[ 28.524205] [<ffffffff81019f29>] dump_trace+0x59/0x320
[ 28.528176] [<ffffffff8101a2ea>] show_stack_log_lvl+0xfa/0x180
[ 28.528176] [<ffffffff8101b091>] show_stack+0x21/0x40
[ 28.532181] [<ffffffff81339d97>] dump_stack+0x5c/0x85
[ 28.536215] [<ffffffff811933d0>] panic+0xd2/0x219
[ 28.540220] [<ffffffff81146099>] watchdog_timer_fn+0x1d9/0x1e0
[ 28.540220] [<ffffffff810f9ca1>] __hrtimer_run_queues+0xf1/0x290
[ 28.544201] [<ffffffff810fa119>] hrtimer_interrupt+0x99/0x1a0
[ 28.548200] [<ffffffff81633109>] smp_apic_timer_interrupt+0x39/0x50
[ 28.552209] [<ffffffff816311dc>] apic_timer_interrupt+0x8c/0xa0
[ 28.556176] DWARF2 unwinder stuck at apic_timer_interrupt+0x8c/0xa0
[ 28.560212]
[ 28.560212] Leftover inexact backtrace:
[ 28.560212]
[ 28.564181] <IRQ> <EOI> [<ffffffff813531bc>] ? iowrite16+0x2c/0x30
[ 28.568180] [<ffffffffa01e88e2>] ? vp_notify+0x12/0x20 [virtio_pci]
[ 28.572199] [<ffffffffa0118082>] ? virtqueue_notify+0x12/0x30 [virtio_ring]
[ 28.576216] [<ffffffffa03786f3>] ? virtio_queue_rq+0x233/0x270 [virtio_blk]
[ 28.580211] [<ffffffff81313d8f>] ? blk_mq_dispatch_rq_list+0xcf/0x1e0
[ 28.584210] [<ffffffff81313fc3>] ? blk_mq_process_rq_list+0x123/0x140
[ 28.588178] [<ffffffff8131407a>] ? __blk_mq_run_hw_queue+0x9a/0xb0
[ 28.592209] [<ffffffff81314169>] ? __blk_mq_delay_run_hw_queue+0xd9/0xf0
[ 28.596179] [<ffffffff81315117>] ? blk_sq_make_request+0x367/0x4a0
[ 28.600210] [<ffffffff81308048>] ? generic_make_request+0xf8/0x2b0
[ 28.600210] [<ffffffff8130826e>] ? submit_bio+0x6e/0x140
[ 28.604199] [<ffffffff812ffc2e>] ? bio_alloc_bioset+0x16e/0x2a0
[ 28.608216] [<ffffffff811ef2fc>] ? cache_grow+0x17c/0x230
[ 28.612210] [<ffffffff8124677d>] ? submit_bh_wbc+0x12d/0x160
[ 28.616219] [<ffffffff81246a9d>] ? block_read_full_page+0x1dd/0x310
[ 28.616219] [<ffffffff81248c40>] ? I_BDEV+0x10/0x10
[ 28.620187] [<ffffffff81196238>] ? __add_to_page_cache_locked+0x128/0x220
[ 28.624203] [<ffffffff81249500>] ? blkdev_writepages+0x30/0x30
[ 28.628206] [<ffffffff81249500>] ? blkdev_writepages+0x30/0x30
[ 28.632233] [<ffffffff81196738>] ? do_read_cache_page+0x108/0x1b0
[ 28.636214] [<ffffffff8131c8d9>] ? read_dev_sector+0x79/0x90
[ 28.640179] [<ffffffff81322d2a>] ? read_lba+0xda/0x170
[ 28.640179] [<ffffffff8132332a>] ? find_valid_gpt+0xfa/0x710
[ 28.644212] [<ffffffff8119d69b>] ? __alloc_pages_nodemask+0x10b/0xbc0
[ 28.648203] [<ffffffff813239c9>] ? efi_partition+0x89/0x440
[ 28.652236] [<ffffffff81343d46>] ? string.isra.4+0x36/0xd0
[ 28.660195] [<ffffffff81345d69>] ? snprintf+0x39/0x40
[ 28.664199] [<ffffffff81323940>] ? find_valid_gpt+0x710/0x710
[ 28.669305] [<ffffffff8131e8b8>] ? check_partition+0xf8/0x1e0
[ 28.676181] [<ffffffff8131d196>] ? rescan_partitions+0xb6/0x380
[ 28.680659] [<ffffffff8162db9c>] ? mutex_lock+0x1c/0x38
[ 28.684224] [<ffffffff8124b1c3>] ? __blkdev_get+0x373/0x4b0
[ 28.692214] [<ffffffff8124b4d0>] ? blkdev_get+0x1d0/0x320
[ 28.700173] [<ffffffff8122a29e>] ? unlock_new_inode+0x4e/0x80
[ 28.704167] [<ffffffff8124a7b2>] ? bdget+0x122/0x140
[ 28.712227] [<ffffffff8131a92c>] ? device_add_disk+0x38c/0x4a0
[ 28.716190] [<ffffffff813530de>] ? ioread8+0x2e/0x40
[ 28.724207] [<ffffffffa03792f5>] ? virtblk_probe+0x465/0x7ad [virtio_blk]
[ 28.732195] [<ffffffffa00de698>] ? virtio_dev_probe+0x138/0x210 [virtio]
[ 28.736193] [<ffffffff814884b7>] ? driver_probe_device+0x1f7/0x420
[ 28.744178] [<ffffffff8148875b>] ? __driver_attach+0x7b/0x80
[ 28.748172] [<ffffffff814886e0>] ? driver_probe_device+0x420/0x420
[ 28.756215] [<ffffffff814863a8>] ? bus_for_each_dev+0x58/0x90
[ 28.764172] [<ffffffff814878f9>] ? bus_add_driver+0x1c9/0x280
[ 28.768167] [<ffffffffa037e000>] ? 0xffffffffa037e000
[ 28.776166] [<ffffffff8148911b>] ? driver_register+0x5b/0xd0
[ 28.784297] [<ffffffffa037e04a>] ? init+0x4a/0x1000 [virtio_blk]
[ 28.792183] [<ffffffff81002138>] ? do_one_initcall+0xc8/0x1f0
[ 28.796171] [<ffffffff8119430d>] ? do_init_module+0x5a/0x1d7
[ 28.804166] [<ffffffff811136a7>] ? load_module+0x1e97/0x2800
[ 28.808172] [<ffffffff8110ff40>] ? __symbol_put+0x40/0x40
[ 28.816169] [<ffffffff811141e0>] ? SYSC_finit_module+0x70/0xa0
[ 28.820180] [<ffffffff816303b2>] ? entry_SYSCALL_64_fastpath+0x16/0x71

http://bugzilla.opensuse.org/show_bug.cgi?id=1081431#c4

--- Comment #4 from Matwey Kornilov <matwey.kornilov@gmail.com> ---
A broad range of guest kernels shows the same lockup: from 4.4 to 4.15. I have not tested earlier kernels.

http://bugzilla.opensuse.org/show_bug.cgi?id=1081431#c5

--- Comment #5 from Matwey Kornilov <matwey.kornilov@gmail.com> ---
It seems that applying d391f1207067 ("x86/kvm/vmx: do not use vm-exit instruction length for fast MMIO when running nested") to the host kernel fixes the issue.
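
The title of that commit suggests the problem is specific to nested setups, i.e. the machine running qemu is itself a virtual machine. The report does not state this explicitly, so the following is only a rough sketch for checking it on the host. It assumes the usual Linux convention that /proc/cpuinfo lists a "hypervisor" CPU flag when running under a hypervisor; the function name is_virtualized is made up for this example.

```shell
#!/bin/sh
# Return 0 if the given cpuinfo file (default: /proc/cpuinfo) lists the
# "hypervisor" CPU flag, i.e. this machine appears to be a guest itself.
# Assumption: the hypervisor exposes that flag to its guests.
is_virtualized() {
    grep -qw hypervisor "${1:-/proc/cpuinfo}" 2>/dev/null
}

if is_virtualized; then
    echo "hypervisor flag present: qemu/KVM guests started here would be nested"
else
    echo "no hypervisor flag found: this looks like bare metal"
fi
```

A missing flag does not prove the host is bare metal, since a hypervisor may hide it.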