Bug ID 1214988
Summary btrfs-cleaner hangs and: watchdog: BUG: soft lockup - CPU#8 stuck for
Classification openSUSE
Product openSUSE Distribution
Version Leap 15.5
Hardware x86-64
OS openSUSE Leap 15.5
Status NEW
Severity Normal
Priority P5 - None
Component Basesystem
Assignee screening-team-bugs@suse.de
Reporter diego.ercolani@gmail.com
QA Contact qa-bugs@suse.de
Target Milestone ---
Found By ---
Blocker ---

On my system I use btrfs, and sometimes I see this entry in the journal:
Sep 05 09:52:53 pc-diego kernel: watchdog: BUG: soft lockup - CPU#8 stuck for
26s! [btrfs-cleaner:2833]
Sep 05 09:52:53 pc-diego kernel: Modules linked in: rpcsec_gss_krb5 nfsv4
dns_resolver nfs fscache netfs rfcomm tcp_diag inet_diag af_packet 8021q garp
m>
Sep 05 09:52:53 pc-diego kernel:  platform_profile syscopyarea rfkill
mdio_devres sysfillrect snd sysimgblt irqbypass video pcspkr efi_pstore(N)
wmi_bmof>
Sep 05 09:52:53 pc-diego kernel: Supported: No, Proprietary and Unsupported
modules are loaded
Sep 05 09:52:53 pc-diego kernel: CPU: 8 PID: 2833 Comm: btrfs-cleaner Tainted:
P           OE  X  N 5.14.21-150500.55.19-default #1 SLE15-SP5 a29285bac85>
Sep 05 09:52:53 pc-diego kernel: Hardware name: System manufacturer System
Product Name/PRIME B450M-A, BIOS 3002 03/10/2021
Sep 05 09:52:53 pc-diego kernel: RIP:
0010:generic_bin_search.constprop.31+0xc4/0x190 [btrfs]
Sep 05 09:52:53 pc-diego kernel: Code: 92 52 36 e9 48 c1 ff 06 48 c1 e7 0c 48
01 f8 49 8b 7d 00 48 01 d7 81 e7 ff 0f 00 00 48 8d 3c 38 4c 89 fe e8 9e f7 >
Sep 05 09:52:53 pc-diego kernel: RSP: 0018:ffffb672c8063b00 EFLAGS: 00000286
Sep 05 09:52:53 pc-diego kernel: RAX: 00000000ffffffff RBX: 0000000000000033
RCX: 000000000000006c
Sep 05 09:52:53 pc-diego kernel: RDX: 00000000000009f6 RSI: ffffb672c8063bef
RDI: ffff99a90581b560
Sep 05 09:52:53 pc-diego kernel: RBP: 0000000000000036 R08: ffffb672c8063b98
R09: 0000000000000000
Sep 05 09:52:53 pc-diego kernel: R10: 0000000000000001 R11: ffff99a850d23e78
R12: 0000000000000031
Sep 05 09:52:53 pc-diego kernel: R13: ffff99a850d23e00 R14: 0000000000000019
R15: ffffb672c8063bef
Sep 05 09:52:53 pc-diego kernel: FS:  0000000000000000(0000)
GS:ffff99b2bec00000(0000) knlGS:0000000000000000
Sep 05 09:52:53 pc-diego kernel: CS:  0010 DS: 0000 ES: 0000 CR0:
0000000080050033
Sep 05 09:52:53 pc-diego kernel: CR2: 00007f6ae7b3c000 CR3: 0000000ba9a10000
CR4: 00000000003506e0
Sep 05 09:52:53 pc-diego kernel: Call Trace:
Sep 05 09:52:53 pc-diego kernel:  <TASK>
Sep 05 09:52:53 pc-diego kernel:  btrfs_search_slot+0x400/0x920 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  btrfs_lookup_file_extent+0x4a/0x70 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  btrfs_get_extent+0x141/0x870 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? __x86_return_thunk+0x5/0x6
Sep 05 09:52:53 pc-diego kernel:  ? lock_extent_bits+0x4a/0xa0 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  defrag_lookup_extent+0xcf/0x120 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  btrfs_defrag_file+0x90a/0x1230 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? __x86_return_thunk+0x5/0x6
Sep 05 09:52:53 pc-diego kernel:  ? __x86_return_thunk+0x5/0x6
Sep 05 09:52:53 pc-diego kernel:  ? btrfs_iget_path+0x67/0x700 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? btrfs_get_root_ref+0x18d/0x310 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? btrfs_run_defrag_inodes+0x27c/0x360 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? __x86_return_thunk+0x5/0x6
Sep 05 09:52:53 pc-diego kernel:  btrfs_run_defrag_inodes+0x221/0x360 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  cleaner_kthread+0xec/0x130 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  ? csum_one_extent_buffer+0x110/0x110 [btrfs
0bea397a25b504ffddb94375cf1ffcaec85fa26b]
Sep 05 09:52:53 pc-diego kernel:  kthread+0x156/0x180
Sep 05 09:52:53 pc-diego kernel:  ? set_kthread_struct+0x50/0x50
Sep 05 09:52:53 pc-diego kernel:  ret_from_fork+0x22/0x30
Sep 05 09:52:53 pc-diego kernel:  </TASK>

I think the problem is related to an issue on one of the btrfs filesystems, but
I don't know which one, as every "btrfs scrub" finishes correctly:
pc-diego:~ # mount | grep btrfs
/dev/mapper/raid-rootfs on / type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=256,subvol=/@)
/dev/mapper/raid-usr on /usr type btrfs
(rw,relatime,compress=zstd:9,space_cache,autodefrag,subvolid=5,subvol=/)
/dev/mapper/raid-usr on /usr/local type btrfs
(rw,relatime,compress=zstd:9,space_cache,autodefrag,subvolid=257,subvol=/local)
/dev/mapper/raid-rootfs on /root type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=260,subvol=/@/root)
/dev/mapper/raid-rootfs on /srv type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=259,subvol=/@/srv)
/dev/mapper/raid-rootfs on /var type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=257,subvol=/@/var)
/dev/mapper/raid-boot on /boot type btrfs
(rw,relatime,space_cache,subvolid=5,subvol=/)
/dev/sdc1 on /M2 type btrfs
(rw,relatime,compress=zlib:3,ssd,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-var_lib_systemd_coredump on /var/lib/systemd/coredump type
btrfs (rw,relatime,compress=zlib:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-openwrt on /opt/openwrt type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-OVA on /dati/OVA type btrfs
(rw,relatime,compress=zlib:9,space_cache,subvolid=5,subvol=/)
/dev/mapper/raid-ISO on /dati/ISO type btrfs
(rw,relatime,compress=zlib:3,space_cache,autodefrag,subvolid=5,subvol=/)
/dev/mapper/nonraid-akonadi_data on /home/SSISNET/diego/.local/share/akonadi
type btrfs (rw,noatime,compress=zstd:9,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-var_tmp on /var/tmp type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-NextCloud on /home/SSISNET/diego/Nextcloud type btrfs
(rw,noatime,compress=zlib:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/raid-virtualmachines on /dati/virtualmachines type btrfs
(rw,relatime,compress=zstd:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-tmp on /tmp type btrfs
(rw,relatime,compress=zlib:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/raid-samba on /dati/samba type btrfs
(rw,relatime,compress=zlib:3,space_cache,autodefrag,subvolid=5,subvol=/)
/dev/mapper/nonraid-baloo on /home/SSISNET/diego/.local/share/baloo type btrfs
(rw,relatime,compress=zstd:3,space_cache,subvolid=5,subvol=/)
/dev/mapper/nonraid-var_log on /var/log type btrfs
(rw,relatime,compress=zstd:3,space_cache=v2,autodefrag,subvolid=5,subvol=/)
/dev/mapper/nonraid-var_cache on /var/cache type btrfs
(rw,relatime,compress=zstd:3,space_cache,subvolid=5,subvol=/)
/dev/bcache0 on /dati/virtualmachines/nonraid/bcache type btrfs
(rw,noatime,compress=zlib:3,ssd,space_cache,subvolid=5,subvol=/)
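
The call trace above goes from cleaner_kthread through btrfs_run_defrag_inodes
and btrfs_defrag_file, which suggests the stuck thread is doing autodefrag
work. Assuming that path is only taken on filesystems mounted with the
autodefrag option (an assumption, not verified here), the candidates could be
narrowed down with a small filter over the mount output. The function name
autodefrag_mounts is made up for this sketch:

```shell
# Hypothetical helper: read `mount` output on stdin and print the mount point
# of every btrfs filesystem whose option list contains "autodefrag".
# Assumes the usual "<dev> on <mountpoint> type <fstype> (<options>)" layout
# and mount points without spaces.
autodefrag_mounts() {
    awk '$5 == "btrfs" && $6 ~ /[(,]autodefrag[,)]/ { print $3 }'
}

# usage: mount | autodefrag_mounts
```

Against the listing above this should leave /usr, /usr/local, /dati/ISO,
/dati/samba and /var/log as candidates.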


The process list is full of btrfs-cleaner threads, but I don't know how to tell
which partition is causing the problem:
pc-diego:~ # ps axuww | grep btrfs-cleaner
root      1049  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      1104  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      1251  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      1554  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2178  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2283  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2371  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2479  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2538  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2638  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2696  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2756  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2779  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2797  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2815  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2833  4.0  0.0      0     0 ?        S    09:46   4:09
[btrfs-cleaner]
root      2852  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
root      2901  0.0  0.0      0     0 ?        S    09:46   0:00
[btrfs-cleaner]
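
To pick the busy thread out of that list, a minimal sketch could filter the ps
output for btrfs-cleaner entries with non-zero %CPU. The function name
busy_cleaners is made up for this sketch, and it assumes the column layout of
"ps axuww" shown above (PID in column 2, %CPU in column 3, TIME in column 10).
It only singles out the thread; it does not by itself say which filesystem the
thread belongs to:

```shell
# Hypothetical helper: read `ps axuww` output on stdin and print
# "PID %CPU TIME" for every btrfs-cleaner kernel thread that used any CPU.
busy_cleaners() {
    awk '$11 == "[btrfs-cleaner]" && $3 + 0 > 0 { print $2, $3, $10 }'
}

# usage: ps axuww | busy_cleaners
```

On the list above this would print only PID 2833, the same PID named in the
soft lockup message.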

By the way, qgroups are off:
pc-diego:~ # snapper get-config | grep QGROUP
QGROUP                 |      

pc-diego:~ # mount | grep btrfs | cut -f 3 -d " " | while read a; do btrfs qgroup show $a; done
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled
ERROR: can't list qgroups: quotas not enabled

