Re: [opensuse-factory] btrfs fun / disk full

24 Jan 2016

      On Fri, Jan 22, 2016 at 3:33 PM, Christian Boltz <opensuse@cboltz.de> wrote:
...
Hello,
I just had an interesting[tm] problem - updating to the latest
tumbleweed failed at random places.
It turned out that my btrfs / partition was full. Not with df -h (which
reported some GB free), but "btrfs fi show" showed 100% usage.
(Sorry for not including the exact output - I didn't save it.)
I'd say this shouldn't happen, but more information is needed to
understand what's happening and give an explanation. Two related
issues come up in these cases.

1. Near the time the volume becomes close to full, large files are
being written. Unallocated space is then allocated as data chunks,
which means it can't be used for metadata. One or more files are
deleted, and then smaller files are written, such as application or
system updates, which are metadata-heavy changes. But the existing
metadata chunks don't have enough space for these changes. Now the
file system is considered full even though there's unused space in
data chunks.

2. Btrfs is a copy-on-write file system, which means even file
deletion requires space to write the change, since there's no
overwriting. So it's possible to get into a rare but particularly
annoying situation where files can't be deleted. The workaround is to
add a small device, delete files, balance with -dusage=15 (other
values work too; this one should run fairly fast while still freeing a
lot of space), then remove the device from the Btrfs volume.
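
As a rough sketch of that workaround, the sequence could look like the
following. The function name rescue_plan, the mount point, and the
device /dev/sdX1 are placeholders, not anything from this thread. The
function only prints the commands, so they can be reviewed and then run
by hand as root.

```shell
# Print (not run) the steps for freeing space on a full Btrfs volume
# by temporarily adding a small extra device.
# Usage: rescue_plan <mountpoint> <spare-device>   (both are placeholders)
rescue_plan() {
    mnt="$1"
    dev="$2"
    echo "btrfs device add $dev $mnt"
    echo "# now delete the files or snapshots you no longer need"
    echo "btrfs balance start -dusage=15 $mnt"
    echo "btrfs device remove $dev $mnt"
}

rescue_plan / /dev/sdX1
```

The extra device only has to be big enough for a few new chunks. The
final 'device remove' moves any chunks off it again, so it needs to
finish before the device is unplugged.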

For space related problems, it's best to include in the post:

df
btrfs filesystem show
btrfs filesystem df
btrfs filesystem usage
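
One way to collect all four in one go is a small wrapper like the one
below. It's only a sketch: space_report is a made-up name, the mount
point defaults to /, and some of the btrfs subcommands need root to
give complete output.

```shell
# Run the four space-related commands against one mount point and
# label each section, so the whole report can be pasted into a post.
space_report() {
    mnt="${1:-/}"
    for cmd in "df -h" "btrfs filesystem show" \
               "btrfs filesystem df" "btrfs filesystem usage"; do
        echo "### $cmd $mnt"
        # $cmd is deliberately unquoted so it splits into command + arguments
        $cmd "$mnt" 2>&1 || echo "(failed with exit status $?)"
    done
}
```

Something like 'space_report / > btrfs-space.txt' then gives a single
file to attach.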

It is a bit tedious. The idea is that df should be reliable; there has
been a lot of discussion about it on the Btrfs list, and the behavior
has changed a few times based on those conversations. 'btrfs fi usage'
is meant to give more information than the (normal) df command. The
'fi df' and 'fi show' subcommands are now mainly used for
troubleshooting and for explaining behavior that doesn't meet
expectations from df and 'fi usage'. So for most users 'fi usage'
should usually be enough.
...
I moved 15 GB of libvirt images to a different partition and deleted
some old snapshots, but neither helped.
After some searching (and temporarily breaking my /, which I could
luckily repair with snapper rollback), I found out [1] that I should run
    btrfs balance start / -dlimit=3
which freed quite some space.
That's normal right now. The kernel code only deletes chunks when they
become completely empty.
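
Because of that, partly filled chunks stay allocated until a balance
compacts them. A common approach, sketched here with arbitrary example
thresholds, is to balance data chunks in steps of increasing usage so
the cheap cases are handled first. balance_steps is a hypothetical
helper that only prints the commands. The -dusage=N filter restricts
the balance to data chunks that are less than N percent full.

```shell
# Print a staged balance: start with empty chunks, then progressively
# fuller ones. The thresholds 0/5/15/30 are examples, not recommendations.
balance_steps() {
    mnt="${1:-/}"
    for pct in 0 5 15 30; do
        echo "btrfs balance start -dusage=$pct $mnt"
    done
}

balance_steps /
```

Each step can be run by hand as root, checking 'btrfs fi usage' in
between and stopping once enough space is unallocated again.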

Another run freed even more space, so I
decided to run it without the limit:
    btrfs balance start /
After that, I'm down to
# btrfs fi show
Label: none  uuid: 9f4a918d-fcd4-45d3-a1dc-7b887300eabc
     Total devices 1 FS bytes used 22.91GiB
     devid    1 size 50.00GiB used 25.31GiB path /dev/mapper/cboltz-root
Even if you re-add the 15 GB libvirt images in the calculation, this
still means the rebalance freed about 10 GB = 20% of the partition size.
So the good news is that I could solve the problem.
The bad news is that it happened at all.
I'd say the behavior you describe is becoming less common, is
suboptimal when it happens, and Btrfs is being improved.
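
For reading the 'fi show' output quoted above, the two interesting
numbers are how much of the device is still unallocated and how much
is allocated to chunks but not actually used. On the devid line, "used"
means allocated to chunks. A small awk sketch follows; chunk_summary is
a made-up name, and it assumes a single device with all sizes printed
in GiB, as in that output.

```shell
# Read 'btrfs fi show' output on stdin and print two derived numbers.
# Assumes one device and GiB units throughout.
chunk_summary() {
    awk '
        /FS bytes used/ { used = $NF + 0 }              # bytes used by files and metadata
        $1 == "devid"   { size = $4 + 0; alloc = $6 + 0 }  # device size, space allocated to chunks
        END {
            printf "unallocated: %.2f GiB\n", size - alloc
            printf "allocated but unused: %.2f GiB\n", alloc - used
        }'
}

chunk_summary <<'EOF'
     Total devices 1 FS bytes used 22.91GiB
     devid    1 size 50.00GiB used 25.31GiB path /dev/mapper/cboltz-root
EOF
# prints:
#   unallocated: 24.69 GiB
#   allocated but unused: 2.40 GiB
```

With about half the device unallocated after the full balance, new
data or metadata chunks can be created as needed.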
...
Should there be a cronjob that does the rebalancing or other btrfs
maintenance regularly?
If something does this, it's really just masking the problem. I think
the developers would prefer for there to be some minor problems like
this and get user reports so they can try to fix and fine tune the
behavior rather than having it masked. It's been a challenging problem
to solve. So I can see why the maintenance script is provided but not
enabled by default.

-- 
Chris Murphy