[opensuse-factory] Problem with BTRFS in openSUSE Tumbleweed: No space left on device
Hi guys!

I'm having a very strange problem with Tumbleweed and BTRFS. I have already reported the issue upstream [1], where I was advised to report it to openSUSE. I have a workstation with BTRFS on the root file system, and I started seeing, on a daily basis:

"No space left on device"

even though I have tons of unallocated space, as can be seen here:

# btrfs fi usage /
Overall:
    Device size:                   1.26TiB
    Device allocated:            119.07GiB
    Device unallocated:            1.14TiB
    Device missing:                  0.00B
    Used:                        115.08GiB
    Free (estimated):              1.14TiB      (min: 586.21GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:113.01GiB, Used:111.19GiB
   /dev/sda6     113.01GiB

Metadata,DUP: Size:3.00GiB, Used:1.94GiB
   /dev/sda6       6.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6      64.00MiB

Unallocated:
   /dev/sda6       1.14TiB

I notice that the problem usually starts when I am building a package in a chroot using osc, or when I start KVM (the KVM disk is on another EXT4 partition, so there are no COW problems). When it happens, sometimes btrfs balance works, sometimes not. When it does not work, my only options are to reboot or to delete files so that the balance can run.

Since this is a production machine, I decided to format and reinstall Tumbleweed using the latest available ISO. However, on that same day, when I started to build local packages using osc, the problem happened again. The behavior was completely strange. When the problem occurred, I had:

Metadata, DUP: total=2.00GiB, used=811.52MiB

I could not run a balance, but after a while the problem fixed itself and I saw:

Metadata, DUP: total=9.50GiB, used=811.52MiB

A few minutes later I hit the same problem again; it again resolved itself automatically, but then I saw:

Metadata,DUP: Size:33.50GiB, Used:812.78MiB

I have no idea why the metadata allocation is being increased so much.
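As context for the numbers above: btrfs accounts for space per chunk type, so the device-level "free" estimate can be misleading. The following sketch is purely illustrative (it is not part of the report; the figures are copied from the `btrfs fi usage /` output above) and separates device-level free space from the free room inside already-allocated chunks:

```python
# Illustrative sketch (not from the report): btrfs allocates space in
# per-type chunks. "No space left on device" can mean one chunk type is
# exhausted even while terabytes remain unallocated at the device level.

def chunk_slack(size_gib, used_gib):
    """Free room inside the already-allocated chunks of one type, in GiB."""
    return size_gib - used_gib

# Figures copied from the `btrfs fi usage /` output above.
print(f"data slack:     {chunk_slack(113.01, 111.19):.2f} GiB")
print(f"metadata slack: {chunk_slack(3.00, 1.94):.2f} GiB")
# If one type runs dry and a fresh chunk cannot be reserved for it,
# writes fail with ENOSPC regardless of how much room the other type has.
```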
During the last occurrence, I could see some messages in dmesg:

Ago 22 16:00:03 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 9323937792 flags 34
Ago 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Ago 22 16:00:04 ronanarraes-osd kernel: BTRFS info (device sda6): 1 enospc errors during balance
Ago 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 36201037824 flags 34
Ago 22 16:00:24 ronanarraes-osd kernel: BTRFS info (device sda6): 2 enospc errors during balance
Ago 22 16:00:45 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 36234592256 flags 34
Ago 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Ago 22 16:00:46 ronanarraes-osd kernel: BTRFS info (device sda6): 4 enospc errors during balance
Ago 22 16:01:20 ronanarraes-osd kernel: BTRFS info (device sda6): relocating block group 38415630336 flags 34
Ago 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): found 1 extents
Ago 22 16:01:21 ronanarraes-osd kernel: BTRFS info (device sda6): 8 enospc errors during balance

Do you have any idea what is happening? I have no clue what to do next.

Regards,
Ronan Arraes
--
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On Tue, Aug 23, 2016 at 1:39 PM, Ronan Arraes Jardim Chagas wrote:
Hi guys!
I'm having a very strange problem with Tumbleweed and BTRFS. I have already reported the issue upstream [1], where I was advised to report it to openSUSE. I have a workstation with BTRFS on the root file system, and I started seeing, on a daily basis:
I suggested Ronan come here to report it, because I've never seen this behavior reported by anyone, and I've been a linux-btrfs@ list regular for years. Either there's a unique use case that's triggering it, or maybe there's an unfound bug in 4.7.0 that no one else has hit yet. I'm not sure, but I'm not seeing it with mainline or Fedora kernels.

Questions for Ronan:

What does 'cat /proc/mounts' show for this file system?
Does the problem happen with older kernels? What (package) versions have you tried? In particular, I'm curious whether you've tried a 4.5.7 openSUSE-built kernel.
I notice that the problem usually starts when I am building some package in a chroot using osc
Can you provide the steps for setting up this chroot? Knowing which package is being built will also help, so that a developer can reproduce exactly what you're doing.

--
Chris Murphy
Hi!

On Tue, 2016-08-23 at 16:49 -0600, Chris Murphy wrote:
Questions for Ronan:
What does 'cat /proc/mounts' for this file system show? Does the problem happen with older kernels? What (package) versions have you tried? In particular I'm curious if you've tried a 4.5.7 opensuse built kernel?
It has been happening for a while; I can't say exactly when it started, but it definitely happened with kernel 4.6.

$ cat /proc/mounts
sysfs /sys sysfs rw,nosuid,nodev,noexec,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
devtmpfs /dev devtmpfs rw,nosuid,size=32939968k,nr_inodes=8234992,mode=755 0 0
securityfs /sys/kernel/security securityfs rw,nosuid,nodev,noexec,relatime 0 0
tmpfs /dev/shm tmpfs rw,nosuid,nodev 0 0
devpts /dev/pts devpts rw,nosuid,noexec,relatime,gid=5,mode=620,ptmxmode=000 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
tmpfs /sys/fs/cgroup tmpfs ro,nosuid,nodev,noexec,mode=755 0 0
cgroup /sys/fs/cgroup/systemd cgroup rw,nosuid,nodev,noexec,relatime,xattr,release_agent=/usr/lib/systemd/systemd-cgroups-agent,name=systemd 0 0
pstore /sys/fs/pstore pstore rw,nosuid,nodev,noexec,relatime 0 0
cgroup /sys/fs/cgroup/blkio cgroup rw,nosuid,nodev,noexec,relatime,blkio 0 0
cgroup /sys/fs/cgroup/memory cgroup rw,nosuid,nodev,noexec,relatime,memory 0 0
cgroup /sys/fs/cgroup/cpu,cpuacct cgroup rw,nosuid,nodev,noexec,relatime,cpu,cpuacct 0 0
cgroup /sys/fs/cgroup/net_cls,net_prio cgroup rw,nosuid,nodev,noexec,relatime,net_cls,net_prio 0 0
cgroup /sys/fs/cgroup/perf_event cgroup rw,nosuid,nodev,noexec,relatime,perf_event 0 0
cgroup /sys/fs/cgroup/freezer cgroup rw,nosuid,nodev,noexec,relatime,freezer 0 0
cgroup /sys/fs/cgroup/cpuset cgroup rw,nosuid,nodev,noexec,relatime,cpuset 0 0
cgroup /sys/fs/cgroup/devices cgroup rw,nosuid,nodev,noexec,relatime,devices 0 0
cgroup /sys/fs/cgroup/pids cgroup rw,nosuid,nodev,noexec,relatime,pids 0 0
cgroup /sys/fs/cgroup/hugetlb cgroup rw,nosuid,nodev,noexec,relatime,hugetlb 0 0
/dev/sda6 / btrfs rw,relatime,space_cache,subvolid=259,subvol=/@/.snapshots/1/snapshot 0 0
systemd-1 /proc/sys/fs/binfmt_misc autofs rw,relatime,fd=31,pgrp=1,timeout=0,minproto=5,maxproto=5,direct,pipe_ino=6332 0 0
debugfs /sys/kernel/debug debugfs rw,relatime 0 0
mqueue /dev/mqueue mqueue rw,relatime 0 0
hugetlbfs /dev/hugepages hugetlbfs rw,relatime 0 0
tmpfs /var/run tmpfs rw,nosuid,nodev,mode=755 0 0
/dev/sda6 /home btrfs rw,relatime,space_cache,subvolid=262,subvol=/@/home 0 0
/dev/sda6 /boot/grub2/x86_64-efi btrfs rw,relatime,space_cache,subvolid=261,subvol=/@/boot/grub2/x86_64-efi 0 0
/dev/sda6 /var/tmp btrfs rw,relatime,space_cache,subvolid=278,subvol=/@/var/tmp 0 0
/dev/sda6 /.snapshots btrfs rw,relatime,space_cache,subvolid=258,subvol=/@/.snapshots 0 0
/dev/sda6 /var/lib/mailman btrfs rw,relatime,space_cache,subvolid=270,subvol=/@/var/lib/mailman 0 0
/dev/sda6 /var/crash btrfs rw,relatime,space_cache,subvolid=268,subvol=/@/var/crash 0 0
/dev/sda6 /var/lib/named btrfs rw,relatime,space_cache,subvolid=273,subvol=/@/var/lib/named 0 0
/dev/sda6 /var/lib/libvirt/images btrfs rw,relatime,space_cache,subvolid=269,subvol=/@/var/lib/libvirt/images 0 0
/dev/sda6 /var/spool btrfs rw,relatime,space_cache,subvolid=277,subvol=/@/var/spool 0 0
/dev/sda6 /boot/grub2/i386-pc btrfs rw,relatime,space_cache,subvolid=260,subvol=/@/boot/grub2/i386-pc 0 0
/dev/sda6 /var/lib/pgsql btrfs rw,relatime,space_cache,subvolid=274,subvol=/@/var/lib/pgsql 0 0
/dev/sda6 /var/cache btrfs rw,relatime,space_cache,subvolid=267,subvol=/@/var/cache 0 0
/dev/sda6 /srv btrfs rw,relatime,space_cache,subvolid=264,subvol=/@/srv 0 0
/dev/sda6 /usr/local btrfs rw,relatime,space_cache,subvolid=266,subvol=/@/usr/local 0 0
/dev/sda6 /opt btrfs rw,relatime,space_cache,subvolid=263,subvol=/@/opt 0 0
/dev/sda6 /var/opt btrfs rw,relatime,space_cache,subvolid=276,subvol=/@/var/opt 0 0
/dev/sda6 /var/lib/mariadb btrfs rw,relatime,space_cache,subvolid=271,subvol=/@/var/lib/mariadb 0 0
/dev/sda6 /tmp btrfs rw,relatime,space_cache,subvolid=265,subvol=/@/tmp 0 0
/dev/sda6 /var/log btrfs rw,relatime,space_cache,subvolid=275,subvol=/@/var/log 0 0
/dev/sda6 /var/lib/mysql btrfs rw,relatime,space_cache,subvolid=272,subvol=/@/var/lib/mysql 0 0
/dev/sdb2 /media/data ext4 rw,relatime,data=ordered 0 0
tmpfs /run/user/473 tmpfs rw,nosuid,nodev,relatime,size=6589404k,mode=700,uid=473,gid=475 0 0
tmpfs /var/run/user/473 tmpfs rw,nosuid,nodev,relatime,size=6589404k,mode=700,uid=473,gid=475 0 0
tracefs /sys/kernel/debug/tracing tracefs rw,relatime 0 0
tmpfs /run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=6589404k,mode=700,uid=1000,gid=100 0 0
tmpfs /var/run/user/1000 tmpfs rw,nosuid,nodev,relatime,size=6589404k,mode=700,uid=1000,gid=100 0 0
gvfsd-fuse /run/user/1000/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=100 0 0
gvfsd-fuse /var/run/user/1000/gvfs fuse.gvfsd-fuse rw,nosuid,nodev,relatime,user_id=1000,group_id=100 0 0
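For anyone trying to reproduce the subvolume layout above, the btrfs entries can be pulled out of /proc/mounts programmatically. This is an illustrative sketch (not from the thread); it relies only on the standard six-field mounts format and is demonstrated on one sample line from the dump above:

```python
# Sketch: extract (device, mountpoint, subvol) for each btrfs entry in
# /proc/mounts-style text. Fields: device mountpoint fstype options dump pass.

def btrfs_mounts(text):
    result = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        opts = {}
        for kv in fields[3].split(","):
            key, _, val = kv.partition("=")
            opts[key] = val
        result.append((fields[0], fields[1], opts.get("subvol", "")))
    return result

sample = "/dev/sda6 /home btrfs rw,relatime,space_cache,subvolid=262,subvol=/@/home 0 0"
print(btrfs_mounts(sample))  # → [('/dev/sda6', '/home', '/@/home')]
# On a live system: with open("/proc/mounts") as f: btrfs_mounts(f.read())
```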
I notice that the problem usually starts when I am building some package in a chroot using osc
Can you provide the steps for setting up this chroot? The package being built will help, if the developer can exactly reproduce what you're doing.
It is just the default build process using osc. I think it can be reproduced by, for example:

osc co home:Ronis_BR/julia
cd home:Ronis_BR/julia
osc build --root=`pwd`/jail openSUSE_Tumbleweed x86_64

After doing that, the problem sometimes appears. However, I still have not found a reliable way to trigger it.

Best regards,
Ronan Arraes
On 2016-08-24 22:06, Ronan Arraes Jardim Chagas wrote:
It is just the default build process using osc. I think it can be done by, for example:
osc co home:Ronis_BR/julia
cd home:Ronis_BR/julia
osc build --root=`pwd`/jail openSUSE_Tumbleweed x86_64
After doing that, sometimes the problem appears. However, I still did not find any easy way to trigger it.
Try building the kernel locally, with make. The process creates a lot of small files.

--
Cheers / Saludos,
Carlos E. R.
(from 13.1 x86_64 "Bottle" at Telcontar)
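The small-file churn of a build can be approximated without compiling anything. This is a hypothetical stress sketch (the directory name and file count are made up, not from the thread); on btrfs, many tiny files tend to consume metadata chunks quickly because small files can be inlined into the metadata tree:

```shell
#!/bin/sh
# Hypothetical stress sketch: create many small files to force frequent
# btrfs metadata allocations, roughly mimicking a large build's file churn.
set -eu

DIR="${1:-./smallfile-stress}"   # target directory (place it on the btrfs volume)
COUNT="${2:-1000}"               # number of small files to create

mkdir -p "$DIR"
i=0
while [ "$i" -lt "$COUNT" ]; do
    # each file is small enough to be inlined into metadata on default mounts
    printf 'stress %d\n' "$i" > "$DIR/f$i"
    i=$((i + 1))
done
sync
echo "created $COUNT files in $DIR"
```

Watching `btrfs fi df /` while this runs (or while a real build runs) should show whether the Metadata total jumps the way Ronan describes.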
Hi guys,

I just had the problem again. This time it happened during lunch, while the machine was idle; only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had been idle for a long period (about 1 h). This time only a reboot worked; I could not run a balance.

I'm resending here the information that was requested of me on the BTRFS mailing list.

/sys/fs/btrfs/$UUID/allocation/data
./bytes_may_use 0
./bytes_pinned 0
./bytes_reserved 0
./bytes_used 36128374784
./disk_total 37589352448
./disk_used 36128374784
./flags 1
./total_bytes 37589352448
./total_bytes_pinned 20339560448
./single/total_bytes 37589352448
./single/used_bytes 36128374784

/sys/fs/btrfs/$UUID/allocation/metadata
./bytes_may_use 84974452736
./bytes_pinned 0
./bytes_reserved 0
./bytes_used 977354752
./disk_total 4294967296
./disk_used 1954709504
./flags 4
./total_bytes 2147483648
./total_bytes_pinned -57851904
./dup/total_bytes 2147483648
./dup/used_bytes 977354752

# btrfs fi usage /
Overall:
    Device size:                   1.26TiB
    Device allocated:             39.07GiB
    Device unallocated:            1.22TiB
    Device missing:                  0.00B
    Used:                         35.29GiB
    Free (estimated):              1.22TiB      (min: 625.93GiB)
    Data ratio:                       1.00
    Metadata ratio:                   2.00
    Global reserve:              320.00MiB      (used: 0.00B)

Data,single: Size:35.01GiB, Used:33.47GiB
   /dev/sda6      35.01GiB

Metadata,DUP: Size:2.00GiB, Used:932.00MiB
   /dev/sda6       4.00GiB

System,DUP: Size:32.00MiB, Used:16.00KiB
   /dev/sda6      64.00MiB

Unallocated:
   /dev/sda6       1.22TiB

# btrfs fi df /
Data, single: total=35.01GiB, used=33.47GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=2.00GiB, used=932.09MiB
GlobalReserve, single: total=320.00MiB, used=0.0

I also saw the following information in `journalctl`:

Ago 29 10:25:33 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 10:25:33 ronanarraes-osd kernel: WARNING: CPU: 4 PID: 30424 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 10:25:33 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 10:25:33 ronanarraes-osd kernel: CPU: 4 PID: 30424 Comm: kworker/u65:1 Tainted: P O 4.7.1-1-default #1
Ago 29 10:25:33 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 10:25:33 ronanarraes-osd kernel: Workqueue: writeback wb_workfn (flush-btrfs-1)
Ago 29 10:25:33 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 10:25:33 ronanarraes-osd kernel: ffffffff8107ca1e ffff88100027c800 0000000000001000 ffff88082ff06400
Ago 29 10:25:33 ronanarraes-osd kernel: ffff88100c7af784 0000000000001000 ffff8805bd60f6cc ffffffffa025098e
Ago 29 10:25:33 ronanarraes-osd kernel: Call Trace:
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa025098e>] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa026d036>] btrfs_clear_bit_hook+0x296/0x380 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028a755>] clear_state_bit+0x55/0x1d0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028aa0d>] __clear_extent_bit+0x13d/0x3f0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028b8d2>] extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa0273722>] run_delalloc_nocow+0x962/0xba0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa0273cbf>] run_delalloc_range+0x35f/0x3b0 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028c090>] writepage_delalloc.isra.40+0x100/0x170 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028e9d3>] __extent_writepage+0xc3/0x340 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028ee8b>] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028f4fe>] extent_writepages+0x4e/0x60 [btrfs]
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123c64d>] __writeback_single_inode+0x3d/0x3b0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123ce8a>] writeback_sb_inodes+0x20a/0x440
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123d147>] __writeback_inodes_wb+0x87/0xb0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123d49d>] wb_writeback+0x28d/0x330
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123dbe2>] wb_workfn+0x222/0x3f0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff810950ed>] process_one_work+0x1ed/0x4e0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff81095427>] worker_thread+0x47/0x4c0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8109affd>] kthread+0xbd/0xe0
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff816bb71f>] ret_from_fork+0x1f/0x40
Ago 29 10:25:33 ronanarraes-osd kernel: DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40
Ago 29 10:25:33 ronanarraes-osd kernel:
Ago 29 10:25:33 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8109af40>] ? kthread_worker_fn+0x170/0x170
Ago 29 10:34:51 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 10:34:51 ronanarraes-osd kernel: WARNING: CPU: 6 PID: 27335 at ../fs/btrfs/inode.c:9306 btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 10:34:51 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 10:34:51 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 10:34:51 ronanarraes-osd kernel: CPU: 6 PID: 27335 Comm: Cache2 I/O Tainted: P W O 4.7.1-1-default #1
Ago 29 10:34:51 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 10:34:51 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 10:34:51 ronanarraes-osd kernel: ffffffff8107ca1e 0000000000000000 ffff88071b592a80 ffff881000221800
Ago 29 10:34:51 ronanarraes-osd kernel: 0000000000000000 ffff88071b592a80 00000000ffffff9c ffffffffa027dabf
Ago 29 10:34:51 ronanarraes-osd kernel: Call Trace:
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffffa027dabf>] btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8121f6d1>] do_unlinkat+0x131/0x310
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff816bb4f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 10:34:51 ronanarraes-osd kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 10:34:51 ronanarraes-osd kernel:
Ago 29 10:34:51 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 10:34:51 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a61 ]---
Ago 29 11:21:19 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 11:21:19 ronanarraes-osd kernel: WARNING: CPU: 18 PID: 16759 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 11:21:19 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 11:21:19 ronanarraes-osd kernel: CPU: 18 PID: 16759 Comm: kworker/u65:2 Tainted: P W O 4.7.1-1-default #1
Ago 29 11:21:19 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 11:21:19 ronanarraes-osd kernel: Workqueue: writeback wb_workfn (flush-btrfs-1)
Ago 29 11:21:19 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 11:21:19 ronanarraes-osd kernel: ffffffff8107ca1e ffff881000221800 0000000000001000 ffff88082ff06400
Ago 29 11:21:19 ronanarraes-osd kernel: ffff8807b11b6784 0000000000001000 ffff8806acb1f73c ffffffffa025098e
Ago 29 11:21:19 ronanarraes-osd kernel: Call Trace:
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa025098e>] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa026d036>] btrfs_clear_bit_hook+0x296/0x380 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028a755>] clear_state_bit+0x55/0x1d0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028aa0d>] __clear_extent_bit+0x13d/0x3f0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028b8d2>] extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa0272c19>] cow_file_range+0x299/0x440 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa0273cf2>] run_delalloc_range+0x392/0x3b0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028c090>] writepage_delalloc.isra.40+0x100/0x170 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028e9d3>] __extent_writepage+0xc3/0x340 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028ee8b>] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028f4fe>] extent_writepages+0x4e/0x60 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123c64d>] __writeback_single_inode+0x3d/0x3b0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123ce8a>] writeback_sb_inodes+0x20a/0x440
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123d147>] __writeback_inodes_wb+0x87/0xb0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123d49d>] wb_writeback+0x28d/0x330
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123dbe2>] wb_workfn+0x222/0x3f0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff810950ed>] process_one_work+0x1ed/0x4e0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff81095427>] worker_thread+0x47/0x4c0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8109affd>] kthread+0xbd/0xe0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff816bb71f>] ret_from_fork+0x1f/0x40
Ago 29 11:21:19 ronanarraes-osd kernel: DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40
Ago 29 11:21:19 ronanarraes-osd kernel:
Ago 29 11:21:19 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8109af40>] ? kthread_worker_fn+0x170/0x170
Ago 29 11:21:19 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a62 ]---
Ago 29 12:06:07 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 12:06:07 ronanarraes-osd kernel: WARNING: CPU: 3 PID: 27335 at ../fs/btrfs/inode.c:9306 btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 12:06:07 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 12:06:07 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 12:06:07 ronanarraes-osd kernel: CPU: 3 PID: 27335 Comm: Cache2 I/O Tainted: P W O 4.7.1-1-default #1
Ago 29 12:06:07 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 12:06:07 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 12:06:07 ronanarraes-osd kernel: ffffffff8107ca1e 0000000000000000 ffff88071b5eeb00 ffff881000221800
Ago 29 12:06:07 ronanarraes-osd kernel: 0000000000000000 ffff88071b5eeb00 00000000ffffff9c ffffffffa027dabf
Ago 29 12:06:07 ronanarraes-osd kernel: Call Trace:
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffffa027dabf>] btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8121f6d1>] do_unlinkat+0x131/0x310
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff816bb4f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 12:06:07 ronanarraes-osd kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 12:06:07 ronanarraes-osd kernel:
Ago 29 12:06:07 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 12:06:07 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a63 ]---

Best regards,
Ronan Arraes
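One figure in the sysfs dump above stands out: metadata bytes_may_use is far larger than the metadata chunk space that exists. A quick back-of-the-envelope check (an editorial sketch using the numbers quoted above; the reading of the counters follows their usual sysfs meaning, and is not something confirmed in this thread):

```python
# Editorial sketch: compare the metadata reservation counters from the
# /sys/fs/btrfs/$UUID/allocation/metadata dump above. If bytes_may_use
# (space reserved for pending metadata) dwarfs the chunks that exist,
# new reservations see the space as already spoken for, which would
# explain ENOSPC despite 1.2 TiB of unallocated device space.

GiB = 1024 ** 3

bytes_may_use = 84_974_452_736   # reserved for in-flight metadata (~79 GiB)
bytes_used    = 977_354_752      # metadata actually written (~0.9 GiB)
total_bytes   = 2_147_483_648    # metadata chunk space that exists (2 GiB)

room = total_bytes - bytes_used
print(f"reserved:   {bytes_may_use / GiB:6.2f} GiB")
print(f"chunk room: {room / GiB:6.2f} GiB")
print(f"shortfall:  {(bytes_may_use - room) / GiB:6.2f} GiB")  # hugely positive
```

A reservation roughly 40 times the size of all existing metadata chunks, on an otherwise idle machine, would point to a reservation accounting leak rather than genuine exhaustion.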
On 8/29/16 11:56 AM, Ronan Arraes Jardim Chagas wrote:
Hi guys,
I just had the problem again. This time it happened during lunch, while the machine was idle; only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had been idle for a long period (about 1 h). This time only a reboot worked; I could not run a balance.
I'm resending here the information that was requested of me on the BTRFS mailing list.
Hi Ronan - The only reason I saw this is that someone forwarded it to me. As I mentioned on the btrfs list, since the Tumbleweed kernel is unpatched with respect to btrfs, the btrfs list is the right place for this. I'll follow up there.

-Jeff
/sys/fs/btrfs/$UUID/allocation/data
./bytes_may_use 0 ./bytes_pinned 0 ./bytes_reserved 0 ./bytes_used 36128374784 ./disk_total 37589352448 ./disk_used 36128374784 ./flags 1 ./total_bytes 37589352448 ./total_bytes_pinned 20339560448 ./single/total_bytes 37589352448 ./single/used_bytes 36128374784
/sys/fs/btrfs/$UUID/allocation/metadata
./bytes_may_use 84974452736 ./bytes_pinned 0 ./bytes_reserved 0 ./bytes_used 977354752 ./disk_total 4294967296 ./disk_used 1954709504 ./flags 4 ./total_bytes 2147483648 ./total_bytes_pinned -57851904 ./dup/total_bytes 2147483648 ./dup/used_bytes 977354752
# btrfs fi usage / Overall: Device size: 1.26TiB Device allocated: 39.07GiB Device unallocated: 1.22TiB Device missing: 0.00B Used: 35.29GiB Free (estimated): 1.22TiB (min: 625.93GiB) Data ratio: 1.00 Metadata ratio: 2.00 Global reserve: 320.00MiB (used: 0.00B)
Data,single: Size:35.01GiB, Used:33.47GiB /dev/sda6 35.01GiB
Metadata,DUP: Size:2.00GiB, Used:932.00MiB /dev/sda6 4.00GiB
System,DUP: Size:32.00MiB, Used:16.00KiB /dev/sda6 64.00MiB
Unallocated: /dev/sda6 1.22TiB
# btrfs fi df / Data, single: total=35.01GiB, used=33.47GiB System, DUP: total=32.00MiB, used=16.00KiB Metadata, DUP: total=2.00GiB, used=932.09MiB GlobalReserve, single: total=320.00MiB, used=0.0
I also saw the following information in `journalctl`:
Ago 29 10:25:33 ronanarraes-osd kernel: ------------[ cut here ]------- ----- Ago 29 10:25:33 ronanarraes-osd kernel: WARNING: CPU: 4 PID: 30424 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_ Ago 29 10:25:33 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap Ago 29 10:25:33 ronanarraes-osd kernel: CPU: 4 PID: 30424 Comm: kworker/u65:1 Tainted: P O 4.7.1-1-default #1 Ago 29 10:25:33 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013 Ago 29 10:25:33 ronanarraes-osd kernel: Workqueue: writeback wb_workfn (flush-btrfs-1) Ago 29 10:25:33 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000 Ago 29 10:25:33 ronanarraes-osd kernel: ffffffff8107ca1e ffff88100027c800 0000000000001000 ffff88082ff06400 Ago 29 10:25:33 ronanarraes-osd kernel: ffff88100c7af784 0000000000001000 ffff8805bd60f6cc ffffffffa025098e Ago 29 10:25:33 ronanarraes-osd kernel: Call Trace: Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa025098e>] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa026d036>] btrfs_clear_bit_hook+0x296/0x380 [btrfs] Ago 
29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028a755>] clear_state_bit+0x55/0x1d0 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028aa0d>] __clear_extent_bit+0x13d/0x3f0 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028b8d2>] extent_clear_unlock_delalloc+0x62/0x280 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa0273722>] run_delalloc_nocow+0x962/0xba0 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa0273cbf>] run_delalloc_range+0x35f/0x3b0 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028c090>] writepage_delalloc.isra.40+0x100/0x170 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028e9d3>] __extent_writepage+0xc3/0x340 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028ee8b>] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffffa028f4fe>] extent_writepages+0x4e/0x60 [btrfs] Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123c64d>] __writeback_single_inode+0x3d/0x3b0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123ce8a>] writeback_sb_inodes+0x20a/0x440 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123d147>] __writeback_inodes_wb+0x87/0xb0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123d49d>] wb_writeback+0x28d/0x330 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8123dbe2>] wb_workfn+0x222/0x3f0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff810950ed>] process_one_work+0x1ed/0x4e0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff81095427>] worker_thread+0x47/0x4c0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8109affd>] kthread+0xbd/0xe0 Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff816bb71f>] ret_from_fork+0x1f/0x40 Ago 29 10:25:33 ronanarraes-osd kernel: DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40 Ago 29 10:25:33 ronanarraes-osd kernel: Ago 29 10:25:33 ronanarraes-osd kernel: Leftover inexact backtrace: Ago 29 10:25:33 ronanarraes-osd kernel: [<ffffffff8109af40>] ? 
kthread_worker_fn+0x170/0x170
Ago 29 10:34:51 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 10:34:51 ronanarraes-osd kernel: WARNING: CPU: 6 PID: 27335 at ../fs/btrfs/inode.c:9306 btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 10:34:51 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 10:34:51 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 10:34:51 ronanarraes-osd kernel: CPU: 6 PID: 27335 Comm: Cache2 I/O Tainted: P W O 4.7.1-1-default #1
Ago 29 10:34:51 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 10:34:51 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 10:34:51 ronanarraes-osd kernel: ffffffff8107ca1e 0000000000000000 ffff88071b592a80 ffff881000221800
Ago 29 10:34:51 ronanarraes-osd kernel: 0000000000000000 ffff88071b592a80 00000000ffffff9c ffffffffa027dabf
Ago 29 10:34:51 ronanarraes-osd kernel: Call Trace:
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffffa027dabf>] btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff8121f6d1>] do_unlinkat+0x131/0x310
Ago 29 10:34:51 ronanarraes-osd kernel: [<ffffffff816bb4f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 10:34:51 ronanarraes-osd kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 10:34:51 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 10:34:51 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a61 ]---
Ago 29 11:21:19 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 11:21:19 ronanarraes-osd kernel: WARNING: CPU: 18 PID: 16759 at ../fs/btrfs/extent-tree.c:4303 btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 11:21:19 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 11:21:19 ronanarraes-osd kernel: CPU: 18 PID: 16759 Comm: kworker/u65:2 Tainted: P W O 4.7.1-1-default #1
Ago 29 11:21:19 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 11:21:19 ronanarraes-osd kernel: Workqueue: writeback wb_workfn (flush-btrfs-1)
Ago 29 11:21:19 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 11:21:19 ronanarraes-osd kernel: ffffffff8107ca1e ffff881000221800 0000000000001000 ffff88082ff06400
Ago 29 11:21:19 ronanarraes-osd kernel: ffff8807b11b6784 0000000000001000 ffff8806acb1f73c ffffffffa025098e
Ago 29 11:21:19 ronanarraes-osd kernel: Call Trace:
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa025098e>] btrfs_free_reserved_data_space_noquota+0xfe/0x110 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa026d036>] btrfs_clear_bit_hook+0x296/0x380 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028a755>] clear_state_bit+0x55/0x1d0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028aa0d>] __clear_extent_bit+0x13d/0x3f0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028b8d2>] extent_clear_unlock_delalloc+0x62/0x280 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa0272c19>] cow_file_range+0x299/0x440 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa0273cf2>] run_delalloc_range+0x392/0x3b0 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028c090>] writepage_delalloc.isra.40+0x100/0x170 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028e9d3>] __extent_writepage+0xc3/0x340 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028ee8b>] extent_write_cache_pages.isra.36.constprop.53+0x23b/0x350 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffffa028f4fe>] extent_writepages+0x4e/0x60 [btrfs]
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123c64d>] __writeback_single_inode+0x3d/0x3b0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123ce8a>] writeback_sb_inodes+0x20a/0x440
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123d147>] __writeback_inodes_wb+0x87/0xb0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123d49d>] wb_writeback+0x28d/0x330
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8123dbe2>] wb_workfn+0x222/0x3f0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff810950ed>] process_one_work+0x1ed/0x4e0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff81095427>] worker_thread+0x47/0x4c0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8109affd>] kthread+0xbd/0xe0
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff816bb71f>] ret_from_fork+0x1f/0x40
Ago 29 11:21:19 ronanarraes-osd kernel: DWARF2 unwinder stuck at ret_from_fork+0x1f/0x40
Ago 29 11:21:19 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 11:21:19 ronanarraes-osd kernel: [<ffffffff8109af40>] ? kthread_worker_fn+0x170/0x170
Ago 29 11:21:19 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a62 ]---
Ago 29 12:06:07 ronanarraes-osd kernel: ------------[ cut here ]------------
Ago 29 12:06:07 ronanarraes-osd kernel: WARNING: CPU: 3 PID: 27335 at ../fs/btrfs/inode.c:9306 btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 12:06:07 ronanarraes-osd kernel: Modules linked in: fuse nf_log_ipv6 xt_pkttype nf_log_ipv4 nf_log_common xt_LOG xt_limit af_packet iscsi_ibft iscsi_boot_sysfs msr ip6t_REJECT nf_reject_ipv6 xt_tcpudp nf_
Ago 29 12:06:07 ronanarraes-osd kernel: mei_wdt sysimgblt iTCO_vendor_support i2c_i801 tpm_infineon tpm_tis tpm ioatdma crct10dif_pclmul crc32_pclmul ghash_clmulni_intel aesni_intel aes_x86_64 lrw sparse_keymap
Ago 29 12:06:07 ronanarraes-osd kernel: CPU: 3 PID: 27335 Comm: Cache2 I/O Tainted: P W O 4.7.1-1-default #1
Ago 29 12:06:07 ronanarraes-osd kernel: Hardware name: Hewlett-Packard HP Z820 Workstation/158B, BIOS J63 v03.65 12/19/2013
Ago 29 12:06:07 ronanarraes-osd kernel: 0000000000000000 ffffffff81393104 0000000000000000 0000000000000000
Ago 29 12:06:07 ronanarraes-osd kernel: ffffffff8107ca1e 0000000000000000 ffff88071b5eeb00 ffff881000221800
Ago 29 12:06:07 ronanarraes-osd kernel: 0000000000000000 ffff88071b5eeb00 00000000ffffff9c ffffffffa027dabf
Ago 29 12:06:07 ronanarraes-osd kernel: Call Trace:
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102ed5e>] dump_trace+0x5e/0x320
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102f12c>] show_stack_log_lvl+0x10c/0x180
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8102fe41>] show_stack+0x21/0x40
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff81393104>] dump_stack+0x5c/0x78
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8107ca1e>] __warn+0xbe/0xe0
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffffa027dabf>] btrfs_destroy_inode+0x23f/0x2b0 [btrfs]
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff8121f6d1>] do_unlinkat+0x131/0x310
Ago 29 12:06:07 ronanarraes-osd kernel: [<ffffffff816bb4f6>] entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 12:06:07 ronanarraes-osd kernel: DWARF2 unwinder stuck at entry_SYSCALL_64_fastpath+0x1e/0xa8
Ago 29 12:06:07 ronanarraes-osd kernel: Leftover inexact backtrace:
Ago 29 12:06:07 ronanarraes-osd kernel: ---[ end trace 5774bd3049f78a63 ]---
Best regards, Ronan Arraes
-- Jeff Mahoney SUSE Labs
On 30/08/2016 1:56 AM, Ronan Arraes Jardim Chagas wrote:
I just had the problem again. This time it happened during lunch, while the machine was idle and only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had stayed idle for a long period (about 1 h). This time only a reboot worked; I could not run balance.
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs? Certainly I wouldn't touch it for any production machines. -- Lindsay Mathieson -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
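The quoted symptom - balance itself failing with ENOSPC even though plenty of space is unallocated - is the classic chunk over-allocation situation, and the workaround usually suggested on the btrfs list is a filtered balance that only relocates nearly-empty chunks, since that needs almost no free space. A hedged sketch, assuming the root filesystem from the original report; the usage thresholds are arbitrary starting values:

```shell
# Compact nearly-empty metadata chunks first (metadata is what balloons
# in this report), then nearly-empty data chunks. Low -musage/-dusage
# values need the least free space; raise them step by step (10 -> 25 -> 50)
# if nothing gets relocated.
btrfs balance start -musage=10 /
btrfs balance start -dusage=10 /

# Verify that the over-allocated Metadata "Size" has shrunk back:
btrfs fi usage /
```

This does not fix whatever is causing the runaway metadata allocation; it only reclaims the over-allocated chunks after the fact.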
On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
On 30/08/2016 1:56 AM, Ronan Arraes Jardim Chagas wrote:
I just had the problem again. This time it happened during lunch, while the machine was idle and only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had stayed idle for a long period (about 1 h). This time only a reboot worked; I could not run balance.
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue. We don't support RAID 5/6, full stop. We don't support device replace either, since those features aren't ready for prime time. Those "not supported" rules would apply to openSUSE too, but when I asked about adding the "allow_unsupported" option to openSUSE years ago, we never reached a consensus.

The third is irrelevant for anyone not working on the code. Could it use improvement? Sure, but that's an ongoing process. For the most part, we're cleaning up the messes that came with trying to grow the developer base while it was still a skunkworks project. Lesson for future projects: don't do that.

Lastly, the stability issues. I mostly see bug reports fall into a few buckets. The "OMG my file system is gone now" bugs haven't been common for a long time. Qgroups bugs are still a problem for a variety of reasons, but we're getting close to squashing the last of them. People do run into ENOSPC occasionally, but nowhere near as often as they did years ago. I think the main thing is that the number of users of btrfs has gone up drastically but the bugs reported haven't. -Jeff -- Jeff Mahoney SUSE Labs
On 30 August 2016 at 04:25, Jeff Mahoney
On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
On 30/08/2016 1:56 AM, Ronan Arraes Jardim Chagas wrote:
I just had the problem again. This time it happened during lunch, while the machine was idle and only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had stayed idle for a long period (about 1 h). This time only a reboot worked; I could not run balance.
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue. We don't support RAID 5/6, full stop. We don't support device replace either, since those features aren't ready for prime time. Those "not supported" rules would apply to openSUSE too, but when I asked about adding the "allow_unsupported" option to openSUSE years ago, we never reached a consensus.
Further to this point - anyone relying on any software RAID5/6 solution, or any hardware solution without either an NV cache or battery backup, is fundamentally stupid and just asking to lose data. btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.

Jeff, with Leap being a lot more aligned with SLE, maybe it's time to resurrect the idea of implementing the same "not supported" rules there? I would fully support, without hesitation, "allow_unsupported" on Leap. On Tumbleweed I think there is room for more discussion, but even there I think it's best for openSUSE and btrfs to reflect the 'sane' and 'safe' features by default - if people want to go off the deep end, of course they still can.
Lastly, the stability issues. I mostly see bug reports fall into a few buckets. The "OMG my file system is gone now" bugs haven't been common for a long time. Qgroups bugs are still a problem for a variety of reasons, but we're getting close to squashing the last of them. People do run into ENOSPC occasionally, but nowhere near as often as they did years ago. I think the main thing is that the number of users of btrfs has gone up drastically but the bugs reported haven't.
+1 - I did some research on this a few weeks back, especially comparing zfs (everyone's flavour of the month for fancy filesystems on Linux) to btrfs. For the same time period (the creation date of the btrfs bugtracker to date), btrfs has fewer open bugs, of lower severity, and the bugs that are reported seem to consistently be closed more quickly. Looks to me like some people's perceptions of btrfs need to catch up with reality; hopefully with time that'll come.
On 30/08/2016 6:35 PM, Richard Brown wrote:
btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.
Incorrect, ZFS does not have the write hole. And RAID6 has one big advantage over RAID10 - it can always lose up to two drives without losing data. With RAID10, if two drives are from the same mirror, then all data is lost. And of course with RAIDZ3 you can lose up to three drives. -- Lindsay Mathieson
On 30 August 2016 at 10:55, Lindsay Mathieson
On 30/08/2016 6:35 PM, Richard Brown wrote:
btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.
Incorrect, ZFS does not have the write hole.
And RAID6 has one big advantage over RAID10 - it can always lose up to two drives without losing data. With RAID10, if two drives are from the same mirror, then all data is lost.
And of course with RAIDZ3 you can lose up to three drives.
RAIDZ is not RAID5 - and before you claim I'm splitting hairs, there is a very good reason they called it RAIDZ besides branding ;)
On 30/08/2016 7:15 PM, Richard Brown wrote:
RAIDZ is not RAID5 - and before you claim I'm splitting hairs, there is a very good reason they called it RAIDZ besides branding ;)
You brought up zfs; it's functionally equivalent. -- Lindsay Mathieson
On 30 August 2016 at 11:17, Lindsay Mathieson
On 30/08/2016 7:15 PM, Richard Brown wrote:
RAIDZ is not RAID5 - and before you claim I'm splitting hairs, there is a very good reason they called it RAIDZ besides branding ;)
You brought up zfs; it's functionally equivalent.
I brought it up in the context of comparing the bug-tracker health of btrfs. I didn't bring it up in the context of functional equivalence or capability; after all, that would be rather stupid, given zfs is not part of the openSUSE distribution and this is the opensuse-factory mailing list.
Sent from iPhone
On 30 Aug 2016, at 11:55, Lindsay Mathieson
wrote: On 30/08/2016 6:35 PM, Richard Brown wrote: btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.
Incorrect, ZFS does not have the write hole.
And RAID6 has one big advantage over RAID10 - it can always lose up to two drives without losing data. With RAID10, if two drives are from the same mirror, then all data is lost.
Note that btrfs RAID10 only tolerates a single failed drive - due to the allocation pattern, it is unpredictable whether the loss of *any* second drive will result in data loss.
And of course with RAIDZ3 you can lose up to three drives.
-- Lindsay Mathieson
On Tue, Aug 30, 2016 at 11:04 PM, Andrei Borzenkov
Sent from iPhone
On 30 Aug 2016, at 11:55, Lindsay Mathieson
wrote: On 30/08/2016 6:35 PM, Richard Brown wrote: btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.
Incorrect, ZFS does not have the write hole.
And RAID6 has one big advantage over RAID10 - it can always lose up to two drives without losing data. With RAID10, if two drives are from the same mirror, then all data is lost.
Note that btrfs RAID10 only tolerates a single failed drive - due to the allocation pattern, it is unpredictable whether the loss of *any* second drive will result in data loss.
Yes, this could be a mkfs artifact. I found that with four identically sized devices, the kernel code consistently allocates the same block group stripe to each device, but mkfs initially allocates a different stripe. This puts two stripes on each device, which causes this problem. It'd be nice to figure out whether the kernel allocator is in fact consistent (or just appears to be, based on observation) and, if so, use the same assignment logic for mkfs. At least with identically sized devices and both data and metadata chunks using a raid10 profile, it should look more like a conventional raid10.

The gotcha, though, is that adding any device will invariably cause the Btrfs kernel code to produce a different allocation, and then all bets are off again. The same applies if the metadata profile is single, DUP, or raid1, because in effect that makes the unallocated space differ among the devices, which affects the block group allocation.

So for the time being, in practice it is true that you can only depend for sure on surviving one missing device with Btrfs raid1 or raid10; it is not scalable right now. -- Chris Murphy
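For anyone who wants to see how chunks actually landed on their own devices, the allocation behaviour Chris describes can be observed directly. A hedged sketch - the mount point and device node are placeholders, and on older btrfs-progs the chunk-tree dump was a separate btrfs-debug-tree tool:

```shell
# Tabular per-device view: shows how much of each block-group profile
# (Data/Metadata; single/DUP/raid1/raid10) sits on every device.
btrfs filesystem usage -T /mnt

# Low-level view of the chunk tree: each chunk item lists the devid of
# every stripe, so you can see exactly which devices a block group spans.
# Read-only, but best run against a quiescent or unmounted filesystem.
btrfs inspect-internal dump-tree -t chunk /dev/sdb
```

Comparing the stripe devids of the earliest chunks (written by mkfs) against later ones (written by the kernel) is how the mkfs-vs-kernel discrepancy above shows up.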
On 8/30/16 4:35 AM, Richard Brown wrote:
On 30 August 2016 at 04:25, Jeff Mahoney
wrote: On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
On 30/08/2016 1:56 AM, Ronan Arraes Jardim Chagas wrote:
I just had the problem again. This time it happened during lunch, while the machine was idle and only system processes were running. It is not the first time I have seen this problem right after lunch, when the machine had stayed idle for a long period (about 1 h). This time only a reboot worked; I could not run balance.
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue. We don't support RAID 5/6, full stop. We don't support device replace either, since those features aren't ready for prime time. Those "not supported" rules would apply to openSUSE too, but when I asked about adding the "allow_unsupported" option to openSUSE years ago, we never reached a consensus.
Further to this point - anyone relying on any software RAID5/6 solution, or any hardware solution without either an NV cache or battery backup, is fundamentally stupid and just asking to lose data.
btrfs' RAID 5/6 implementation may not be one of the best ones out there, but all of them put data at risk thanks to the wonders of the write hole.
Jeff, with Leap being a lot more aligned with SLE, maybe it's time to resurrect the idea of implementing the same "not supported" rules there? I would fully support, without hesitation, "allow_unsupported" on Leap. On Tumbleweed I think there is room for more discussion, but even there I think it's best for openSUSE and btrfs to reflect the 'sane' and 'safe' features by default - if people want to go off the deep end, of course they still can.
Leap 42.2, being branched from the SLE12 SP2 kernel, does have the allow_unsupported flag already. As for Tumbleweed, I'd have no problem using it there too. I proposed it in 2013 but, as I said, we never reached an agreement on it. The allow_unsupported flag is what allows us to support btrfs as our default file system. Are there shaky parts of btrfs? Absolutely. That's why this flag exists -- so our users don't stumble into them by accident. The fact that people are trying and/or even pointing to raid5/6 as a problem for btrfs is a surprise to me simply because we just don't allow it on SLES. Late last night I proposed to some of the core btrfs developers that we add a similar flag to the mainline kernel. The flag would have a different name and possibly be per-mount instead of system-wide, but the idea would be generally the same: Make it clear when a feature is experimental and require explicit user action to enable it. -Jeff -- Jeff Mahoney SUSE Labs
On 2016-08-30 04:25, Jeff Mahoney wrote:
On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue. We don't support RAID 5/6, full stop.
I assume you mean btrfs with RAID 5/6. Software raid with any other filesystem is Ok, right? -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 30 August 2016 at 11:09, Carlos E. R.
On 2016-08-30 04:25, Jeff Mahoney wrote:
On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue. We don't support RAID 5/6, full stop.
I assume you mean btrfs with RAID 5/6. Software raid with any other filesystem is Ok, right?
OK is a matter of perspective.

All RAID 5/6 implementations, including hardware ones, are at risk of a 'write hole' when something interrupts the write. When this happens, the parity information doesn't match the rest of the data for the stripe.

For hardware RAID this typically only happens during a power outage. Software RAID has an increased chance of it happening, due to increased opportunities for something to interrupt that write, such as a kernel crash.

Hardware RAID arrays mitigate this risk by using either non-volatile caches or battery backups to allow the writes to complete even in the event of a power outage. Software RAID has no such mitigations.

So, RAID 5/6 always comes with an element of risk. A high-end hardware controller with NV cache and theoretically 'perfect' firmware has little risk. A well-maintained hardware controller, with a sysadmin regularly checking that the battery is working, has a small risk. And it increases from there.

And the btrfs RAID 5/6 implementation is probably the least safe of all the software RAID options available on openSUSE (and is therefore not available on SLE). But I personally wouldn't recommend software RAID 5/6 of any type, on any openSUSE filesystem, for anyone who wants to ensure their data is consistent and stays that way.
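The write hole Richard describes can be shown with a toy calculation: treat each strip as a single byte and parity as the XOR of the data strips. This is purely an illustration of the failure mode, not btrfs or md code:

```shell
# Two data strips and their XOR parity, as plain integers.
d1=170; d2=85
p=$(( d1 ^ d2 ))          # parity = 255

# A crash interrupts a stripe update: the new d1 reaches the disk,
# but the matching parity write never happens.
d1=240                    # new data on the platter
                          # p still holds the stale value 255

# Later the d2 drive dies; RAID reconstructs d2 from d1 and parity:
recovered=$(( d1 ^ p ))
echo "$recovered"         # prints 15, not the real 85: silent corruption
```

A plain RAID5 has no way to notice that the recovered strip is wrong; checksumming filesystems can at least detect the mismatch on read, and the NV caches / battery backups mentioned above prevent the interrupted write in the first place.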
On 2016-08-30 11:32, Richard Brown wrote:
On 30 August 2016 at 11:09, Carlos E. R. <> wrote:
I assume you mean btrfs with RAID 5/6. Software raid with any other filesystem is Ok, right?
Ok is a matter of perspective.
All RAID 5/6 implementations, including hardware ones, are at risk of a 'write hole' when something interrupts the write. When this happens, the parity information doesn't match the rest of the data for the stripe.
Ah, ok, but this is not new.
For hardware RAID this typically only happens during a power outage. Software RAID has an increased chance of it happening, due to increased opportunities for something to interrupt that write, such as a kernel crash.
Yes, fair enough :-)
Hardware RAID arrays mitigate this risk by using either non-volatile caches or battery backups to allow the writes to complete even in the event of a power outage.
Software RAID has no such mitigations.
Aha, yes. Still, software raid has some advantages over fake hardware raids in the home (server) scenario. I was not considering serious implementations, with real hardware cards that include a battery backup. (For the record, I have always been against raid in home (server) scenarios. I instead advise people to dedicate the extra disk(s) to backups. I think that many people misunderstand what raids are for.)
So, RAID 5/6 always comes with an element of risk. A high-end hardware controller with NV cache and theoretically 'perfect' firmware has little risk. A well-maintained hardware controller, with a sysadmin regularly checking that the battery is working, has a small risk. And it increases from there.
Understood.
And the btrfs RAID 5/6 implementation is probably the least safe of all the software RAID options available on openSUSE (and is therefore not available on SLE).
This is the new factor :-)
But I personally wouldn't recommend software RAID 5/6 of any type, on any openSUSE filesystem, for anyone who wants to ensure their data is consistent and stays that way.
But nothing new in this area. Nothing like "debacle over RAID 5/6" that Lindsay mentioned. Or is he referring to a new implementation of raid? :-? I thought I heard of a raid implementation peculiar to btrfs. -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On 30/08/2016 7:57 PM, Carlos E. R. wrote:
But nothing new in this area. Nothing like "debacle over RAID 5/6" that Lindsay mentioned.
Or is he referring to a new implementation of raid? :-? I thought I heard of a raid implementation peculiar to btrfs.
"Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite" http://www.phoronix.com/scan.php?page=news_item&px=Btrfs-RAID-56-Is-Bad The btrfs scrub recalculates the wrong parity for raid 5/6 volumes, the bug has existed for years - probably the cause for a recovery failure bug the btrfs team have been trying to track down. -- Lindsay Mathieson -- To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 2016-08-30 12:08, Lindsay Mathieson wrote:
On 30/08/2016 7:57 PM, Carlos E. R. wrote:
But nothing new in this area. Nothing like "debacle over RAID 5/6" that Lindsay mentioned.
Or is he referring to a new implementation of raid? :-? I thought I heard of a raid implementation peculiar to btrfs.
"Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite"
http://www.phoronix.com/scan.php?page=news_item&px=Btrfs-RAID-56-Is-Bad
« It turns out the RAID5 and RAID6 code for the Btrfs file-system's built-in RAID support is faulty and users should not be making use of it if you care about your data. » It is "btrfs built-in raid", thus not the traditional Linux software raid implementation.
The btrfs scrub recalculates the wrong parity for raid 5/6 volumes, the bug has existed for years - probably the cause for a recovery failure bug the btrfs team have been trying to track down.
Thanks for this clarification :-) -- Cheers / Saludos, Carlos E. R. (from 13.1 x86_64 "Bottle" at Telcontar)
On Tue, Aug 30, 2016 at 4:08 AM, Lindsay Mathieson
On 30/08/2016 7:57 PM, Carlos E. R. wrote:
But nothing new in this area. Nothing like "debacle over RAID 5/6" that Lindsay mentioned.
Or is he referring to a new implementation of raid? :-? I thought I heard of a raid implementation peculiar to btrfs.
"Btrfs RAID 5/6 Code Found To Be Very Unsafe & Will Likely Require A Rewrite"
http://www.phoronix.com/scan.php?page=news_item&px=Btrfs-RAID-56-Is-Bad
The btrfs scrub recalculates the wrong parity for raid 5/6 volumes, the bug has existed for years - probably the cause for a recovery failure bug the btrfs team have been trying to track down.
This is overstated. Parity is not always recalculated incorrectly. It first requires a bad data strip, and second it requires a scrub, during which a data csum mismatch happens; the data is then correctly reconstructed and repaired from good parity, and then *sometimes* bad parity is recomputed and written to disk. It's not clear that this happens passively (during normal reads) or with the balance code, as it's only been reproduced with existing corruption of a data strip and during scrubs. And it doesn't cleanly reproduce; the original reproducer and I only found it happening about 1 in 3 attempts. So it's actually rather obscure, which in no way discounts how bad a bug it is.

Next, Phoronix doesn't bother to point out that Duncan isn't a developer, or kernel developer, or Btrfs developer (and neither am I). Duncan states this fact in fully half, or more, of his posts. So while I agree with his opinion that Btrfs raid56 is only suitable for testing, I can't agree with the broad statement that Btrfs raid56 is so problematic that it has to be totally scrapped and rewritten. That isn't knowable to those unfamiliar with Btrfs code. It might very well be that there are a handful of obscure yet treacherous bugs that can eventually be found and fixed.

A much bigger problem, affecting more users, for all consumer drive configurations, whether single drive or in some kind of raid (LVM, mdadm, Btrfs, and maybe even ZFS on Linux): http://www.spinics.net/lists/raid/msg52800.html http://www.spinics.net/lists/raid/msg53389.html It comes up probably once every week or two, and people lose data because of this (now) very common misconfiguration.

Another big problem for Btrfs multiple-device support is the lack of faulty device handling. Btrfs will continue to read and write to bad devices, indefinitely, spewing errors the entire time, which itself sometimes causes problems (not least of which is the log spamming that ensues).
There are patches available for testing upstream that aren't yet merged because they haven't had enough testing. So if people want this to get better, more will have to step up and do some testing with sane hardware setups, in simple configurations, without any other out-of-tree patches or kernel modules, and be completely OK with actually having to use their backups.

In something like 7 years now, I've had no data loss with Btrfs. I have had a couple of arrays implode such that they became read-only, so they had to be recreated; in both cases there was a user-induced event that caused the degraded state, during which a bug was triggered that resulted in the read-only state. -- Chris Murphy
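For readers who want to check their own arrays for the kind of latent data-strip corruption discussed above, the standard health checks look roughly like this (the mount point is a placeholder):

```shell
# Foreground scrub: reads every copy, verifies checksums, repairs from
# a good copy where redundancy allows, and prints a summary on exit.
btrfs scrub start -B /mnt

# Per-device error counters (write/read/flush/corruption/generation);
# persistent non-zero values here are the failing-device warning signs
# that Btrfs otherwise keeps writing through.
btrfs device stats /mnt
```

Both commands are read-mostly and safe on a mounted filesystem, but on raid56 profiles a scrub is exactly the operation the parity bug above is triggered by, so keep backups current before scrubbing such an array.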
On Tue, Aug 30, 2016 at 3:32 AM, Richard Brown
All RAID 5/6 implementations, including hardware ones, are at risk of a 'write hole' when something interrupts the write. When this happens, the parity information doesn't match the rest of the data for the stripe.
mdadm now has a journal to avoid this problem. https://lwn.net/Articles/665299/

As for Btrfs, this has been asked a few times on the list, but those familiar enough with raid56 haven't replied. If full stripe writes are always CoW, and the stripe doesn't really "exist" until it's completely committed both in the metadata and in the super block, then how does the write hole happen? If it happens, it suggests partial writes aren't CoW. And that has implications for csum consistency too, not just parity.

OK, so let's say partial stripe writes are overwrites, not CoW. When Btrfs runs into a bad data strip, either through a drive read error or a data csum mismatch, Btrfs reconstructs the data from parity. It does this blindly, because parity strips themselves are not checksummed. Upon reconstruction, it compares against the data csum, and if there's a mismatch it reports an I/O error in the kernel messages as well as up to user space.

So does the write hole exist on Btrfs? If we split the problem into two parts - inconsistency between data and parity strips vs. silent data corruption ensuing from reconstruction from bad parity - Btrfs has the former problem; it does not have the latter. -- Chris Murphy
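The mdadm journal mentioned above is configured at array-creation time. A hedged sketch with placeholder device names, assuming mdadm >= 3.4 and a kernel >= 4.4 (which introduced the raid5 journal):

```shell
# RAID5 over three disks with a write journal on a fast SSD partition.
# All stripe updates land in the journal first, so an interrupted write
# is replayed after a crash instead of leaving a write hole.
mdadm --create /dev/md0 --level=5 --raid-devices=3 \
      --write-journal=/dev/nvme0n1p1 \
      /dev/sdb1 /dev/sdc1 /dev/sdd1
```

The trade-off is that the journal device becomes a single point of failure for the array's consistency guarantees, so it should itself be power-loss-safe.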
On 8/30/16 5:09 AM, Carlos E. R. wrote:
On 2016-08-30 04:25, Jeff Mahoney wrote:
On 8/29/16 5:42 PM, Lindsay Mathieson wrote:
Given the debacle over RAID 5/6 and the ongoing issues with stability, and supposedly a very messy codebase, is btrfs still a wise default for installs?
Those are three separate issues. The first, from our (SUSE) perspective, isn't an issue: we don't support RAID 5/6, full stop.
I assume you mean btrfs with RAID 5/6. Software raid with any other filesystem is Ok, right?
Yes, exactly. That's probably the single biggest reason we haven't invested more in btrfs RAID5/6 -- MD RAID5/6 is pretty bulletproof. -Jeff -- Jeff Mahoney SUSE Labs
On Wed, Aug 24, 2016 at 2:06 PM, Ronan Arraes Jardim Chagas
It is just the default build process using osc. I think it can be reproduced with, for example:
osc co home:Ronis_BR/julia
cd home:Ronis_BR/julia
osc build --root=`pwd`/jail openSUSE_Tumbleweed x86_64
After doing that, the problem sometimes appears. However, I still haven't found an easy way to trigger it.
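One hypothetical way to catch the moment the metadata allocation jumps during a build (assumes btrfs-progs is installed and / is Btrfs; the log path is arbitrary):

```shell
# Hypothetical monitoring loop: log Btrfs metadata allocation every 30s
# while a build runs, to catch the moment the Metadata "total" jumps.
# Assumes btrfs-progs is installed and / is a Btrfs filesystem.
while true; do
    date '+%F %T' >> /tmp/btrfs-meta.log
    sudo btrfs fi df / | grep -i metadata >> /tmp/btrfs-meta.log
    sleep 30
done
```

Correlating the timestamps in the log with the build steps (and with snapper activity) could narrow down what triggers the allocation spike.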
OK, I have some idea... There *is* a fairly reproducible case where making read-only snapshots while there is heavy writing occurring will result in enospc.

Question: can you totally disable snapper and try to reproduce this again? Whatever is the longest time you've had between enospc errors is how long you should run without snapshots *while* doing these builds. You can turn snapper back on to take some snapshots when you're done building things.

Try that for a while and see if it seems to fix it. If it seems to fix it, try doing a build and then just make your own read-only snapshot of /home at the CLI to try to trigger it. http://www.spinics.net/lists/linux-btrfs/msg52670.html -- Chris Murphy
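The snapper-off test suggested above might look like this (a hypothetical sketch; the config name "root" is the openSUSE default, and the snapshot path is arbitrary):

```shell
# Hypothetical sketch of the suggested test; adjust names to your setup.

# 1. Stop snapper's periodic snapshots:
sudo snapper -c root set-config "TIMELINE_CREATE=no"
sudo systemctl disable --now snapper-timeline.timer snapper-cleanup.timer

# 2. Run the osc builds as usual and watch for enospc.

# 3. To try to trigger the bug manually, take a read-only snapshot
#    right after a build finishes:
sudo btrfs subvolume snapshot -r /home /home/.snapshot-test
```

Note that installing packages with zypper also creates snapper snapshots on openSUSE, so for a clean test, package installs should be avoided during the build runs as well.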
Hi! On Thu, 2016-08-25 at 17:53 -0600, Chris Murphy wrote:
OK I have some idea...
There *is* a fairly reproducible case where making read-only snapshots while there is heavy writing occurring will result in enospc.
Question: can you totally disable snapper and try to reproduce this again? Whatever is the longest time you've had between enospc errors is how long you should run without snapshots *while* doing these builds. You can turn snapper back on to take some snapshots when you're done building things.
Try that for a while and see if it seems to fix it. If it seems to fix it, try doing a build and then just make your own read-only snapshot of /home at the CLI to try to trigger it.
This appeared to describe the behavior correctly, since during some local builds I installed something with zypper, which triggers a snapshot. However, yesterday the problem occurred after lunch, when the computer had been idle for almost 1h. As of now, the problem seems completely random. Any ideas? Best regards, Ronan Arraes
participants (7)
- Andrei Borzenkov
- Carlos E. R.
- Chris Murphy
- Jeff Mahoney
- Lindsay Mathieson
- Richard Brown
- Ronan Arraes Jardim Chagas