On 9/2/16 10:43 PM, Ronan Arraes Jardim Chagas wrote:
Hi Jeff!
Thanks for the enlightening e-mail. It is clear that qgroups were not the root cause of my issue, as you said. I can confirm that even without this feature, the bug still happens with the same frequency.
I am glad to know a little bit more about how things work with SLE and, now, with Leap. Unfortunately, I will not be able to change my distro from TW to Leap because I need at least kernel 4.6. But I do have a side question: I understood that SLE and Leap do not seem to have any major problems with qgroups, but, since Tumbleweed uses an almost vanilla kernel w.r.t. BTRFS, can qgroups also be considered stable there?
Yes. The thing with the SLES btrfs implementation is that it tends to differ broadly from what the kernel version number shown by uname -r would suggest, but not very much from upstream. In fact, the only patches we have applied to the 4.4-based SLE12-SP2/Leap kernel are backports and patches we've submitted upstream, plus the patch to disable unsupported features and the patches to export the per-root anon device via stat(). So, yes, I expect that qgroups should be stable on Tumbleweed as well, but I haven't had time to test it the way we have for SLE12 SP2.
Back to my bug, one thing that **always** happens is that after the discussed problem, the metadata increases **a lot** (I saw something like 80 GiB once). Hence, is there any kind of configuration to set a minimum metadata size? If there is, do you think that I can get rid of those ENOSPC errors by setting a very high limit, such as 300 GiB?
The minimum metadata size? I'm not sure what you mean here. Usually the weird ENOSPC cases are caused by miscalculated reservations, or perhaps by not taking into account that new block groups can be created.
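To put numbers on the metadata growth described above, one option is to record how much metadata space is allocated versus actually used. The following is only a rough sketch in Python: it assumes the usual line format printed by `btrfs filesystem df <mountpoint>`, the names `parse_df` and `metadata_slack` are hypothetical helpers, and the sample values are invented for illustration rather than taken from any real system.

```python
import re

# Binary unit multipliers as printed by btrfs-progs in its default output.
UNITS = {"B": 1, "KiB": 1024, "MiB": 1024**2, "GiB": 1024**3, "TiB": 1024**4}

# Assumed line shape: "Metadata, DUP: total=80.00GiB, used=2.50GiB"
LINE_RE = re.compile(
    r"^(?P<kind>\w+), (?P<profile>\w+): "
    r"total=(?P<total>[\d.]+)(?P<tunit>(?:[KMGT]i)?B), "
    r"used=(?P<used>[\d.]+)(?P<uunit>(?:[KMGT]i)?B)$"
)

def parse_df(text):
    """Return {kind: (total_bytes, used_bytes)} for each recognised line."""
    result = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # skip lines that do not match the assumed format
        total = float(m["total"]) * UNITS[m["tunit"]]
        used = float(m["used"]) * UNITS[m["uunit"]]
        result[m["kind"]] = (int(total), int(used))
    return result

def metadata_slack(text):
    """Bytes allocated to metadata block groups but not currently used."""
    total, used = parse_df(text)["Metadata"]
    return total - used

# Invented sample values, for illustration only.
SAMPLE = """\
Data, single: total=100.00GiB, used=80.00GiB
System, DUP: total=32.00MiB, used=16.00KiB
Metadata, DUP: total=80.00GiB, used=2.50GiB
GlobalReserve, single: total=512.00MiB, used=0.00B
"""

if __name__ == "__main__":
    print(metadata_slack(SAMPLE) / 1024**3, "GiB allocated but unused")
```

Running this against output captured before and after the problem appears would show whether the jump is in allocated metadata block groups or in metadata actually in use.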
Ah! I think I forgot to mention one very important thing: I have been using Tumbleweed+BTRFS on this machine for a very, very long time. I think I installed it just after it changed to the current model. At that time, I was using the same machine but without one peripheral that requires a "new" kernel (HDD, processor, RAM, everything was the same). AFAIK, the first time I saw this problem was this year. So, I think it must be a regression introduced by some kernel / btrfs-progs update.
I expect that it's a regression as well, but without diagnosing it, I can't really say.

-Jeff

-- 
Jeff Mahoney
SUSE Labs