Bug ID 1068647
Summary btrfs send bails out on orphan inode
Classification openSUSE
Product openSUSE Tumbleweed
Version Current
Hardware Other
OS Other
Status NEW
Severity Normal
Priority P5 - None
Component Kernel
Assignee kernel-maintainers@forge.provo.novell.com
Reporter martin.wilck@suse.com
QA Contact qa-bugs@suse.de
Found By ---
Blocker ---

I can't create backups of my currently active subvolume with btrfs send any
more. It proceeds for several minutes, then bails out with the error message

   ERROR: send ioctl failed with -2: No such file or directory

This situation has persisted throughout several reboots, remounts, and kernel
updates. 

This first occurred while I was doing lots of experiments with btrfs-send,
working on my "btrfs-clone" utility (https://github.com/mwilck/btrfs-clone) and
testing the efficiency of various cloning strategies within this tool. I made
several successful snapshots, until suddenly the described situation came to
pass, making it impossible to take any more snapshots of the "main" subvolume.
Thus it's possible that the situation was caused by btrfs-send. I admit I did
this while the subvolume was mounted. Of course I always set the cloned
subvolume and its parent to read-only mode while running btrfs-send (it aborts
otherwise).

Except for btrfs-send, no other problems have been observed so far. All other
tools (btrfs-image, btrfs scrub, btrfs inspect-internal dump-tree) silently
ignored this situation.

I've traced the error down to the following call stack:

> send_subvol()
>   full_send_tree()
>     changed_cb [btrfs]()
>       finish_inode_if_needed [btrfs]()
>         get_inode_info.constprop.48 [btrfs]()
>         apply_children_dir_moves [btrfs]()
>         send_utimes [btrfs]()
>           get_cur_path [btrfs]()
>             __get_cur_name_and_parent [btrfs]()
>               is_inode_existent.part.15 [btrfs]()
>               get_first_ref [btrfs]()

get_first_ref() returns with -ENOENT.
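
For illustration, here is a minimal user-space model of that call chain (not
the actual kernel code; most intermediate calls and all real arguments are
omitted), showing how a single -ENOENT from the back-reference lookup bubbles
up and becomes the user-visible "send ioctl failed with -2":

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for get_first_ref(): the orphan inode still has an INODE_ITEM
 * (with links 0) but no INODE_REF, so there is no (parent dir, name) pair
 * left to resolve a path from, and the lookup fails. */
static int get_first_ref(unsigned long long ino)
{
	(void)ino;
	return -ENOENT;
}

/* get_cur_path() -> __get_cur_name_and_parent() needs the inode's first
 * back reference to build its current path. */
static int get_cur_path(unsigned long long ino)
{
	return get_first_ref(ino);
}

static int send_utimes(unsigned long long ino)
{
	int ret = get_cur_path(ino);

	if (ret < 0)
		return ret;	/* the error is passed straight up */
	/* ... would emit the utimes command for ino here ... */
	return 0;
}

static int finish_inode_if_needed(unsigned long long ino)
{
	return send_utimes(ino);
}

int main(void)
{
	int ret = finish_inode_if_needed(1746132ULL);

	if (ret < 0)
		fprintf(stderr, "ERROR: send ioctl failed with %d: %s\n",
			ret, strerror(-ret));
	return ret < 0;
}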

I used some instrumentation to find out what was causing this problem, and
found the following situation in the file system tree:

root tree 28090400768(level 0) / 28058992640(1)
 - subvol @/.subvols/1/snapshot is item 89 at 28061384704

> root tree
> node 28058992640 level 1 items 5 free 488 generation 188885 owner 1
> fs uuid ad544c37-37f9-44a4-9798-d29d3cb5db44
> chunk uuid bb2d5d7b-6b0b-4f2d-9677-097711ef8572
>         key (EXTENT_TREE ROOT_ITEM 0) block 28090400768 (1714502) gen 188885
> 
> leaf 28090400768 items 98 free space 6117 generation 188885 owner 1
> leaf 28090400768 flags 0x1(WRITTEN) backref revision 1
> fs uuid ad544c37-37f9-44a4-9798-d29d3cb5db44
> chunk uuid bb2d5d7b-6b0b-4f2d-9677-097711ef8572
> 
>         item 5 key (ROOT_TREE_DIR INODE_ITEM 0) itemoff 14770 itemsize 160
>                 generation 3 transid 0 size 0 nbytes 16384
>                 block group 0 mode 40755 links 1 uid 0 gid 0 rdev 0
>                 sequence 0 flags 0x0(none)
>                 atime 1493728330.0 (2017-05-02 14:32:10)
>                 ctime 1493728330.0 (2017-05-02 14:32:10)
>                 mtime 1493728330.0 (2017-05-02 14:32:10)
>                 otime 1493728330.0 (2017-05-02 14:32:10)
> 
>         item 12 key (257 ROOT_ITEM 0) itemoff 13191 itemsize 439
>                 generation 180352 root_dirid 256 bytenr 36967874560 level 0 refs 1
>                 lastsnap 12 byte_limit 0 bytes_used 16384 flags 0x0(none)
>                 uuid d4ad4958-2cc1-004e-934d-f223ef4c5f2a
>                 ctransid 180352 otransid 8 stransid 0 rtransid 0
>                 drop key (0 UNKNOWN.0 0) level 0
>         item 13 key (257 ROOT_BACKREF 5) itemoff 13172 itemsize 19
>                 root backref key dirid 256 sequence 2 name @
>
>         item 33 key (258 ROOT_ITEM 0) itemoff 12290 itemsize 439
>                 generation 188511 root_dirid 256 bytenr 17933533184 level 1 refs 1
>                 lastsnap 0 byte_limit 0 bytes_used 163840 flags 0x0(none)
>                 uuid e99de79b-c494-9445-812b-d08389f82947
>                 ctransid 188511 otransid 11 stransid 0 rtransid 0
>                 drop key (0 UNKNOWN.0 0) level 0
>         item 34 key (258 ROOT_BACKREF 257) itemoff 12262 itemsize 28
>                 root backref key dirid 256 sequence 3 name .snapshots
> 
>         item 89 key (259 ROOT_ITEM 12) itemoff 10419 itemsize 439
>                 generation 188884 root_dirid 256 bytenr 28061384704 level 2 refs 1
>                 lastsnap 186971 byte_limit 0 bytes_used 259981312 flags 0x0(none)
>                 uuid 17c54832-5028-694c-9f97-1be0a1df8757
>                 parent_uuid d4ad4958-2cc1-004e-934d-f223ef4c5f2a
>                 ctransid 188884 otransid 12 stransid 0 rtransid 0
>                 drop key (0 UNKNOWN.0 0) level 0
>         item 90 key (259 ROOT_BACKREF 258) itemoff 10393 itemsize 26
>                 root backref key dirid 257 sequence 2 name snapshot

File system tree 28061384704
 - btrfs path 28052832256(level 0) / 28045197312(1) / 28061384704(2)
 - item 8 (1746132 INODE_ITEM 0) has "links 0" and no INODE_REF
 - item 86 is an ORPHAN item referring to the same inode 1746132 (item 8).

> file tree key (259 ROOT_ITEM 12) 
> node 28061384704 level 2 items 56 free 437 generation 188884 owner 259
> fs uuid ad544c37-37f9-44a4-9798-d29d3cb5db44
> chunk uuid bb2d5d7b-6b0b-4f2d-9677-097711ef8572
> 
>         key (1732417 INODE_REF 830626) block 28045197312 (1711743) gen 188860
> 
> node 28045197312 level 1 items 341 free 152 generation 188860 owner 259
> fs uuid ad544c37-37f9-44a4-9798-d29d3cb5db44
> chunk uuid bb2d5d7b-6b0b-4f2d-9677-097711ef8572
> ...
>         key (1746122 INODE_ITEM 0) block 28052832256 (1712209) gen 188821
> 
> leaf 28052832256 items 66 free space 8146 generation 188821 owner 259
> leaf 28052832256 flags 0x1(WRITTEN) backref revision 1
> fs uuid ad544c37-37f9-44a4-9798-d29d3cb5db44
> chunk uuid bb2d5d7b-6b0b-4f2d-9677-097711ef8572
> ...
>         item 7 key (1746123 EXTENT_DATA 4096) itemoff 15701 itemsize 53
>                 generation 188295 type 1 (regular)
>                 extent data disk byte 470392832 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression 0 (none)
>         item 8 key (1746132 INODE_ITEM 0) itemoff 15541 itemsize 160
>                 generation 188295 transid 188472 size 940 nbytes 4096
>                 block group 0 mode 100644 links 0 uid 465 gid 464 rdev 0
>                 sequence 16 flags 0x6(PREALLOC)
>                 atime 1510830949.715375999 (2017-11-16 12:15:49)
>                 ctime 1510838729.976180731 (2017-11-16 14:25:29)
>                 mtime 1510830949.687375946 (2017-11-16 12:15:49)
>                 otime 1510830949.687375946 (2017-11-16 12:15:49)
>         item 9 key (1746132 EXTENT_DATA 0) itemoff 15488 itemsize 53
>                 generation 188295 type 1 (regular)
>                 extent data disk byte 470614016 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression 0 (none)
>         item 10 key (1746213 INODE_ITEM 0) itemoff 15328 itemsize 160
>                 generation 188342 transid 188819 size 6498 nbytes 8192
>                 block group 0 mode 100600 links 1 uid 0 gid 0 rdev 0
>                 sequence 0 flags 0x4(none)
>                 atime 1510849204.24960545 (2017-11-16 17:20:04)
>                 ctime 1510833103.94629539 (2017-11-16 12:51:43)
>                 mtime 1510833103.86629524 (2017-11-16 12:51:43)
>                 otime 1510833037.158509427 (2017-11-16 12:50:37)
> ...
>         item 85 key (1746369 EXTENT_DATA 0) itemoff 7030 itemsize 53
>                 generation 188860 type 1 (regular)
>                 extent data disk byte 368373760 nr 4096
>                 extent data offset 0 nr 4096 ram 4096
>                 extent compression 0 (none)
>         item 86 key (ORPHAN ORPHAN_ITEM 1746132) itemoff 7030 itemsize 0
>                 orphan item

So, while trying to process inode 1746213, btrfs send runs
finish_inode_if_needed() on 1746132, and fails to send the utimes for this
inode because it can't figure out the path.

The way I read the btrfs code, btrfs-send assumes that all orphans have been
cleaned up before it starts running. Indeed, btrfs should have gotten rid of
orphan inodes by running btrfs_orphan_cleanup() via btrfs_cleanup_fs_roots() at
mount time, but that doesn't seem to have happened here.

In fact, I've run btrfs-debug-tree on this file system twice, at different FS
generations, and found that the orphan inode had changed its position (inode
number) in the tree. Unfortunately I didn't keep the output of the first debug
run, but I'm positive that the inode number of the failing inode was in the
1743xxx range first, and is now 1746132.
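
A quicker way to keep watching that orphan item (e.g. to see whether it is
actually cleaned up and re-created, or never cleaned at all) than dumping the
whole tree might be the tree-search ioctl. Below is a minimal sketch, not a
polished tool; it assumes root privileges and that tree_id = 0 makes the ioctl
search the tree of the subvolume the passed fd belongs to:

/* orphan-scan.c: list ORPHAN_ITEM entries in a subvolume's file tree.
 * Usage: ./orphan-scan /path/inside/subvolume (run as root) */
#include <stdio.h>
#include <string.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/btrfs.h>
#include <linux/btrfs_tree.h>

int main(int argc, char **argv)
{
	struct btrfs_ioctl_search_args args;
	struct btrfs_ioctl_search_header *sh;
	unsigned int i;
	int fd, ret;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <path inside subvolume>\n", argv[0]);
		return 1;
	}
	fd = open(argv[1], O_RDONLY);
	if (fd < 0) {
		perror("open");
		return 1;
	}

	memset(&args, 0, sizeof(args));
	/* tree_id 0: search the tree of the subvolume the fd belongs to */
	args.key.tree_id = 0;
	/* orphan items all live under the ORPHAN objectid (-5ULL);
	 * the key offset is the orphaned inode number */
	args.key.min_objectid = BTRFS_ORPHAN_OBJECTID;
	args.key.max_objectid = BTRFS_ORPHAN_OBJECTID;
	args.key.min_type = BTRFS_ORPHAN_ITEM_KEY;
	args.key.max_type = BTRFS_ORPHAN_ITEM_KEY;
	args.key.min_offset = 0;
	args.key.max_offset = (__u64)-1;
	args.key.min_transid = 0;
	args.key.max_transid = (__u64)-1;
	args.key.nr_items = 4096;

	ret = ioctl(fd, BTRFS_IOC_TREE_SEARCH, &args);
	if (ret < 0) {
		perror("BTRFS_IOC_TREE_SEARCH");
		close(fd);
		return 1;
	}

	/* the buffer now holds nr_items results, each a search header
	 * followed by sh->len bytes of item data (0 bytes for orphan items) */
	sh = (struct btrfs_ioctl_search_header *)args.buf;
	for (i = 0; i < args.key.nr_items; i++) {
		printf("orphan item for inode %llu\n",
		       (unsigned long long)sh->offset);
		sh = (struct btrfs_ioctl_search_header *)
			((char *)(sh + 1) + sh->len);
	}
	if (args.key.nr_items == 0)
		printf("no orphan items found\n");
	close(fd);
	return 0;
}

Running this against the affected subvolume before and after a remount (and
comparing the reported inode numbers) should show whether the same orphan item
survives the mount or whether a new one appears in its place.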

Therefore I'm filing this as a kernel bug. The problem was first seen with
kernel 4.13.11-1.1.g0526da3; now I'm running 4.14.0-1.1.gab9e909.

There are 3 aspects to this problem:

 1) How did it come to pass?
 2) Why doesn't the kernel clean up the orphan at boot time? Why does the
orphan change its inode number? Or is the orphan actually cleaned up, but a new
one generated instead?
 3) How do I recover? I'd really love to get rid of this broken inode. I could
simply switch to a different snapshot of the file system, but AFAICS the
situation will persist as long as I don't delete the offending subvolume.
Anyway, as long as 1) and 2) are open, I guess I'd rather keep the subvolume
around to facilitate further debugging. I'm unsure whether this is easily
reproducible.

