[opensuse-factory] snapper snapshot comparison algorithm
Hello,

I wanted to know what algorithm snapper uses for snapshot comparison. From what I have understood from the code, it first compares metadata and then does a block-by-block compare using 'memcmp'. I also wanted to know how directories are 'diff'ed, whether there is scope for improvement in the current implementation, and whether anyone has taken that up yet.

Thank you.

-- 
To unsubscribe, e-mail: opensuse-factory+unsubscribe@opensuse.org
To contact the owner, e-mail: opensuse-factory+owner@opensuse.org
On 12/2/12 11:16 PM, nafisa mandliwala wrote:
> Hello,
>
> I wanted to know what algorithm snapper uses for snapshot comparison. From what I have understood from the code, it first compares metadata and then does a block-by-block compare using 'memcmp'. I also wanted to know how directories are 'diff'ed, whether there is scope for improvement in the current implementation, and whether anyone has taken that up yet.
Btrfs snapshots are mounted in the file system namespace under <fsroot>/.snapshots, so the algorithm snapper currently uses is just 'diff' with a few more 'smarts' to avoid comparing files that haven't changed. As you've seen, that approach requires that we read essentially all of the metadata for the two trees being compared.

The most I/O-intensive part isn't comparing the files but generating the list of changed files. Snapper needs to list all the files in the tree and then stat them to see whether they've changed between the snapshots. This is slow and only gets slower as file system usage grows.

There can't be any shortcut for directories, since a directory's metadata doesn't change for every change to the files inside it. For example, growing a file doesn't change the directory, so the directory's ctime doesn't get updated. Adding or removing a file does.

Obviously, that's not optimal, but when snapper was first released there weren't any better options. There is the btrfs subvolume find-new command, but it doesn't show files that have been removed, and removed files are a big part of the use case (and user expectations) for snapshots.

The approach we've already started implementing is to leverage the btrfs send code in the kernel to implement a new btrfs find-changed command that will list all the files that have been added, removed, changed, or had their metadata changed in some way. The send code already does the tree compare inside the kernel using the on-disk metadata format (rather than the abstract stat format exported to userspace), which includes the ability to recognize when entire subtrees can be skipped.
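To make the list-then-stat approach described above concrete, here is a minimal Python sketch. It is not snapper's code: the function names (list_tree, fingerprint, compare_snapshots) are hypothetical, and the set of stat fields compared is an assumption chosen for illustration. It only shows why every entry in both trees has to be listed and stat'ed before anything can be reported.

```python
import os
import stat


def list_tree(root):
    """Map each path under root (relative to root) to its lstat() result."""
    entries = {}
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            entries[os.path.relpath(full, root)] = os.lstat(full)
    return entries


def fingerprint(st):
    """Metadata used to decide whether an entry changed (an assumed choice)."""
    if stat.S_ISDIR(st.st_mode):
        # Directory size and mtime are ignored in this sketch; added or
        # removed children show up as entries of their own.
        return (st.st_mode, st.st_uid, st.st_gid)
    return (st.st_mode, st.st_uid, st.st_gid, st.st_size, st.st_mtime_ns)


def compare_snapshots(old_root, new_root):
    """Return {relative_path: 'added' | 'removed' | 'changed'}."""
    # Both trees are walked and stat'ed in full; this is the expensive part.
    old = list_tree(old_root)
    new = list_tree(new_root)
    result = {}
    for path in sorted(old.keys() | new.keys()):
        if path not in new:
            result[path] = "removed"
        elif path not in old:
            result[path] = "added"
        elif fingerprint(old[path]) != fingerprint(new[path]):
            result[path] = "changed"
    return result
```

Called with the mount points of two snapshots (whatever paths they have under <fsroot>/.snapshots on a given system), compare_snapshots returns only the entries that differ, but it still has to touch the metadata of every file in both trees, which is the cost that grows with file system usage.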
-Jeff

-- 
Jeff Mahoney
SUSE Labs
participants (2)
- Jeff Mahoney
- nafisa mandliwala