[Bug 1190482] New: kernel BUG at mm/slub.c creates stale workers
http://bugzilla.opensuse.org/show_bug.cgi?id=1190482

            Bug ID: 1190482
           Summary: kernel BUG at mm/slub.c creates stale workers
    Classification: openSUSE
           Product: openSUSE.org
           Version: unspecified
          Hardware: Other
                OS: Other
            Status: NEW
          Severity: Normal
          Priority: P5 - None
         Component: BuildService
          Assignee: screening-team-bugs@suse.de
          Reporter: code@bnavigator.de
        QA Contact: adrian.schroeter@suse.com
          Found By: ---
           Blocker: ---

Starting last week, many (but not all) builds of spyder from openSUSE:Factory/spyder, devel:languages:python:numeric/spyder and branched projects became stale for days because of kernel crashes.

The following code is in spyder.spec:

%check
...
function testspyder() {
  xvfb-run --server-args "-screen 0 1920x1080x24" python3 runtests.py -m "not no_xvfb" --timeout 1800 -ra -k "not (${donttest:4})" $@
  # wait a bit until we can start the next xvfb
  sleep 5
}
testspyder
testspyder --run-slow

Relevant buildlogs:

x86_64 and i586:

[ 928s] = 921 passed, 91 skipped, 176 deselected, 2 xfailed, 26 warnings in 862.01s (0:14:22) =
[ 931s] + sleep 5
[ 936s] [ 927.261691][ T3247] kernel BUG at mm/slub.c:321!
[ 936s] [ 927.263137][ T3247] invalid opcode: 0000 [#1] SMP NOPTI
[ 936s] [ 927.264717][ T3247] CPU: 2 PID: 3247 Comm: sh Not tainted 5.14.1-1-default #1 openSUSE Tumbleweed 77bbc82e23666d88b5be1f7477a6fc9946523f12
[ 936s] [ 927.265564][ T3247] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a-rebuilt.opensuse.org 04/01/2014
[ 936s] [ 927.265564][ T3247] RIP: 0010:__slab_free+0x22d/0x420
[ 936s] [ 927.265564][ T3247] Code: 00 44 8b 44 24 14 44 0f b6 54 24 26 48 8b 54 24 18 8b 74 24 20 48 89 44 24 08 44 0f b6 4c 24 27 48 8b 7c 24 28 e9 8d fe ff ff <0f> 0b 49 3b 54 24 28 0f 85 6b ff ff ff 49 89 5c 24 20 49 89 4c 24
[ 936s] [ 927.265564][ T3247] RSP: 0018:ffffab1244adbb80 EFLAGS: 00010046
[ 936s] [ 927.265564][ T3247] RAX: ffff9b84c3528c60 RBX: ffff9b84c3528c00 RCX: ffff9b84c3528c00
[ 936s] [ 927.265564][ T3247] RDX: 0000000080150005 RSI: fffff7fd840d4a00 RDI: ffff9b84c0042800
[ 936s] [ 927.265564][ T3247] RBP: ffffab1244adbc30 R08: 0000000000000001 R09: ffffffff8cac69d5
[ 936s] [ 927.265564][ T3247] R10: ffffab1244adbcd7 R11: 0000000000000000 R12: fffff7fd840d4a00
[ 936s] [ 927.265564][ T3247] R13: ffff9b84c3528c00 R14: ffff9b84c0042800 R15: ffff9b84c3528c00
[ 936s] [ 927.265564][ T3247] FS: 00007fe60dabbc00(0000) GS:ffff9b85f7c80000(0000) knlGS:0000000000000000
[ 936s] [ 927.265564][ T3247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 936s] [ 927.265564][ T3247] CR2: 000055c3d786eb0c CR3: 00000001035b2000 CR4: 00000000003506e0
[ 936s] [ 927.265564][ T3247] Call Trace:
[ 936s] [ 927.265564][ T3247] ? put_ucounts+0x75/0x90
[ 936s] [ 927.265564][ T3247] kfree+0x352/0x3c0
[ 936s] [ 927.265564][ T3247] put_ucounts+0x75/0x90
[ 936s] [ 927.265564][ T3247] __sigqueue_free.part.0+0x3e/0x60
[ 936s] [ 927.265564][ T3247] dequeue_signal+0x12a/0x1f0
[ 936s] [ 927.265564][ T3247] get_signal+0x206/0x8b0
[ 936s] [ 927.265564][ T3247] ? kmem_cache_free+0x1d0/0x3e0
[ 936s] [ 927.265564][ T3247] ? call_rcu+0xdd/0x7d0
[ 936s] [ 927.265564][ T3247] arch_do_signal_or_restart+0xfd/0x730
[ 936s] [ 927.265564][ T3247] ? do_sigaction+0x116/0x280
[ 936s] [ 927.265564][ T3247] ? queued_spin_unlock+0x5/0x10
[ 936s] [ 927.265564][ T3247] ? wp_page_reuse+0x61/0x70
[ 936s] [ 927.265564][ T3247] ? __handle_mm_fault+0xd66/0x1520
[ 936s] [ 927.265564][ T3247] exit_to_user_mode_prepare+0x12c/0x230
[ 936s] [ 927.265564][ T3247] syscall_exit_to_user_mode+0x18/0x40
[ 936s] [ 927.265564][ T3247] do_syscall_64+0x69/0x80
[ 936s] [ 927.265564][ T3247] ? handle_mm_fault+0xcf/0x2a0
[ 936s] [ 927.265564][ T3247] ? do_user_addr_fault+0x1d5/0x670
[ 936s] [ 927.265564][ T3247] ? do_syscall_64+0x69/0x80
[ 936s] [ 927.265564][ T3247] ? exc_page_fault+0x68/0x130
[ 936s] [ 927.265564][ T3247] entry_SYSCALL_64_after_hwframe+0x44/0xae
[ 936s] [ 927.265564][ T3247] RIP: 0033:0x7fe60db73d8b
[ 936s] [ 927.265564][ T3247] Code: 48 85 f6 74 15 48 b9 00 00 00 80 01 00 00 00 48 8b 06 48 85 c8 75 48 49 89 f0 41 ba 08 00 00 00 4c 89 c6 b8 0e 00 00 00 0f 05 <89> c2 f7 da 3d 00 f0 ff ff b8 00 00 00 00 0f 47 c2 48 8b 94 24 88
[ 936s] [ 927.265564][ T3247] RSP: 002b:00007ffe6e8c36d0 EFLAGS: 00000246 ORIG_RAX: 000000000000000e
[ 936s] [ 927.265564][ T3247] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007fe60db73d8b
[ 936s] [ 927.265564][ T3247] RDX: 0000000000000000 RSI: 00007ffe6e8c37c0 RDI: 0000000000000002
[ 936s] [ 927.265564][ T3247] RBP: 000055c3d787a810 R08: 00007ffe6e8c37c0 R09: 000055c3d7873480
[ 936s] [ 927.265564][ T3247] R10: 0000000000000008 R11: 0000000000000246 R12: 000055c3d787a810
[ 936s] [ 927.265564][ T3247] R13: 000055c3d787a810 R14: 0000000000000000 R15: 00007ffe6e8c37c0
[ 936s] [ 927.265564][ T3247] Modules linked in: ata_generic crc32_pclmul ata_piix qemu_fw_cfg overlay e1000 nls_iso8859_1 nls_cp437 vfat fat virtio_blk virtio_mmio xfs btrfs blake2b_generic xor raid6_pq libcrc32c reiserfs ext4 crc32c_intel mbcache jbd2 squashfs fuse dm_snapshot dm_bufio dm_crypt essiv authenc trusted asn1_encoder tee dm_mod binfmt_misc loop sg virtio_rng
[ 936s] [ 927.265564][ T3247] ---[ end trace 62125b68cc9ddb14 ]---
[ 936s] [ 927.265564][ T3247] RIP: 0010:__slab_free+0x22d/0x420
[ 936s] [ 927.265564][ T3247] Code: 00 44 8b 44 24 14 44 0f b6 54 24 26 48 8b 54 24 18 8b 74 24 20 48 89 44 24 08 44 0f b6 4c 24 27 48 8b 7c 24 28 e9 8d fe ff ff <0f> 0b 49 3b 54 24 28 0f 85 6b ff ff ff 49 89 5c 24 20 49 89 4c 24
[ 936s] [ 927.265564][ T3247] RSP: 0018:ffffab1244adbb80 EFLAGS: 00010046
[ 936s] [ 927.265564][ T3247] RAX: ffff9b84c3528c60 RBX: ffff9b84c3528c00 RCX: ffff9b84c3528c00
[ 936s] [ 927.265564][ T3247] RDX: 0000000080150005 RSI: fffff7fd840d4a00 RDI: ffff9b84c0042800
[ 936s] [ 927.265564][ T3247] RBP: ffffab1244adbc30 R08: 0000000000000001 R09: ffffffff8cac69d5
[ 936s] [ 927.265564][ T3247] R10: ffffab1244adbcd7 R11: 0000000000000000 R12: fffff7fd840d4a00
[ 936s] [ 927.265564][ T3247] R13: ffff9b84c3528c00 R14: ffff9b84c0042800 R15: ffff9b84c3528c00
[ 936s] [ 927.265564][ T3247] FS: 00007fe60dabbc00(0000) GS:ffff9b85f7c80000(0000) knlGS:0000000000000000
[ 936s] [ 927.265564][ T3247] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 936s] [ 927.265564][ T3247] CR2: 000055c3d786eb0c CR3: 00000001035b2000 CR4: 00000000003506e0
[ 996s] [ 987.270511][ C5] rcu: INFO: rcu_sched detected stalls on CPUs/tasks:
[ 996s] [ 987.272683][ C5] rcu: 2-...0: (2 GPs behind) idle=7be/1/0x4000000000000000 softirq=120217/120217 fqs=7500
[ 996s] [ 987.274486][ C5] (detected by 5, t=15003 jiffies, g=290705, q=559)
[ 996s] [ 987.274486][ C5] Sending NMI from CPU 5 to CPUs 2:
[ 996s] [ 987.274486][ C5] NMI backtrace for cpu 2 skipped: idling at native_halt+0xa/0x10

armv7l:

...
[ 1059s] [ 1015.375123][ T3213] kernel BUG at mm/slub.c:321!
[ 1059s] [ 1015.375420][ T3213] Internal error: Oops - BUG: 0 [#1] SMP
..

The workers then chew on the crash for days.

I cannot reproduce the behavior on a local osc build.
Whatever is wrong with the specfile, Spyder or Xvfb, it should not be able to crash the kernel and block an OBS worker for days.

--
You are receiving this mail because:
You are on the CC list for the bug.

http://bugzilla.opensuse.org/show_bug.cgi?id=1190482#c1

--- Comment #1 from OBSbugzilla Bot <bwiedemann+obsbugzillabot@suse.com> ---
This is an autogenerated message for OBS integration:
This bug (1190482) was mentioned in
https://build.opensuse.org/request/show/919871 Factory / spyder

http://bugzilla.opensuse.org/show_bug.cgi?id=1190482#c2

Benjamin Greiner <code@bnavigator.de> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|---                         |WONTFIX

--- Comment #2 from Benjamin Greiner <code@bnavigator.de> ---
not reproducible anymore