Does btrfs have an efficient way to compare snapshots?
Solution 1:
btrfs send, which appeared in Linux 3.6 (2012), "generates a stream of changes between two subvolume snapshots." You can use it just to produce a fast metadata comparison by adding the --no-data flag.
btrfs send --no-data -p /snapshots/parent /snapshots/child
Normally, you would drop the --no-data flag and pipe the output into btrfs receive to do incremental backups. For example, if /snapshots/parent already exists at /backup/snapshots/parent, btrfs send would stream only those changes to the /backup filesystem:
btrfs send -p /snapshots/parent /snapshots/child | btrfs receive /backup/snapshots
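For the metadata-only comparison, the stream that btrfs send produces is binary and not readable by itself. The following is a minimal sketch of turning it into a readable list of operations, assuming a btrfs-progs version that provides btrfs receive --dump and that both snapshots are read-only (which btrfs send requires). The function name snapshot_changes and the BTRFS override variable are made up for this illustration.

```shell
# Hypothetical helper: print the operations that turn <parent> into <child>,
# without sending any file data.
# BTRFS is an illustrative override for the path of the btrfs binary.
BTRFS=${BTRFS:-btrfs}

snapshot_changes() {
    if [ "$#" -ne 2 ]; then
        echo "Usage: snapshot_changes <parent-snapshot> <child-snapshot>" >&2
        return 1
    fi
    # --no-data omits file contents from the stream; receive --dump prints
    # one line per operation instead of applying the stream to a filesystem.
    "$BTRFS" send --no-data -p "$1" "$2" | "$BTRFS" receive --dump
}
```

Run as root, it would be invoked as snapshot_changes /snapshots/parent /snapshots/child.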
Solution 2:
I'm running Debian stable, which did not have btrfs send at the time, so I looked for a solution using btrfs subvolume find-new.
Update: btrfs send was added in Linux 3.6, which was released in 2012 and included in Debian stable by 2015.
If you have snapshot1 and snapshot2 and want to know what changed in the later one, snapshot2, since snapshot1 was made, you can use the script below. It is invoked as
btrfs-diff oldsnapshot/ newsnapshot/
and lists all files changed in newsnapshot/ since oldsnapshot/.
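#!/bin/bash
# List files in <newer-snapshot> that changed since <older-snapshot> was taken.
usage() { echo "$@" >&2; echo "Usage: $0 <older-snapshot> <newer-snapshot>" >&2; exit 1; }
[ $# -eq 2 ] || usage "Incorrect invocation"
SNAPSHOT_OLD=$1
SNAPSHOT_NEW=$2
[ -d "$SNAPSHOT_OLD" ] || usage "$SNAPSHOT_OLD does not exist"
[ -d "$SNAPSHOT_NEW" ] || usage "$SNAPSHOT_NEW does not exist"
# Asking for changes since an absurdly high generation prints only the
# "transid marker was N" line, where N is the old snapshot's current generation.
OLD_TRANSID=$(btrfs subvolume find-new "$SNAPSHOT_OLD" 9999999)
OLD_TRANSID=${OLD_TRANSID#transid marker was }
[ -n "$OLD_TRANSID" ] && [ "$OLD_TRANSID" -gt 0 ] 2>/dev/null || usage "Failed to find generation for $SNAPSHOT_OLD"
# Drop the trailing marker line, keep only the path (field 17 onward), and de-duplicate.
btrfs subvolume find-new "$SNAPSHOT_NEW" "$OLD_TRANSID" | sed '$d' | cut -f17- -d' ' | sort -u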
To explain: btrfs subvolume find-new finds files changed after a particular 'generation' (transaction ID) of a subvolume. It also reports the current generation number, as a final line of the form "transid marker was N".
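To make the two parsing steps concrete, here is a small sketch that runs them on canned text rather than on a real filesystem. The sample lines are illustrative and only mimic the layout of find-new output (the path starts at the 17th space-separated field); the function names parse_transid and extract_paths are made up for this example.

```shell
# Illustrative find-new style output; real inode numbers, lengths and flags will differ.
sample='inode 257 file offset 0 len 29 disk start 0 offset 0 gen 7 flags INLINE bar1
inode 258 file offset 0 len 29 disk start 0 offset 0 gen 7 flags INLINE foo2
transid marker was 7'

# Strip the fixed prefix from the marker line, leaving only the generation number.
parse_transid() {
    marker=$1
    printf '%s\n' "${marker#transid marker was }"
}

# Read find-new style output on stdin: drop the trailing marker line,
# keep the path (field 17 onward), and remove duplicates.
extract_paths() {
    sed '$d' | cut -d' ' -f17- | sort -u
}

parse_transid "transid marker was 7"      # prints 7
printf '%s\n' "$sample" | extract_paths   # prints bar1 and foo2, one per line
```

A file with several changed extents appears on several lines, which is why the output is de-duplicated.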
Caveats
For example, take the case of daily snapshots of a subvolume:
mkdir test && cd test
btrfs subvolume create live
date >live/foo1
date >live/bar1
btrfs subvolume snapshot live/ snap1
date >live/foo2 # new file
date >>live/bar1 # modify file
rm live/foo1 # delete file
btrfs subvolume snapshot live/ snap2
date >live/foo3 # new file
mv live/bar{1,2} # rename file
rm live/foo2 # delete file
What changed between snap1 and snap2?
$ btrfs-diff snap1/ snap2/
bar1
foo2
So we can see the new file, see the modified file, but the deletion is not reported. This is because the command reports on files that exist, not ones that now don't.
What changed between snap2 and the live subvolume?
$ btrfs-diff snap2/ live/
foo3
The renamed file is not reported, because its data has not changed.
Now what if we add data to the renamed file?
date >>live/bar2
btrfs-diff snap2/ live/
bar2
foo3
OK, that makes sense. But let's make a new file:
date >live/lala
btrfs-diff snap2/ live/
bar2
foo3
Eh? Where's lala? If you add another file, lala appears. So this behaviour is a bit odd, which is probably why the wiki says:
The find-new approach has some serious limitations and thus is not really usable for something like send/receive.
However, the oddness comes when you compare a live subvolume against a previous state, not when you're comparing (read-only) snapshots. So this could still be useful unless you want to also identify deleted files.
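Since find-new cannot report deletions, one way to fill that gap is to compare the file lists of the two snapshots directly. This is a rough sketch and not part of the script above. The function name list_deleted is hypothetical. It walks both trees, so it is slower than a metadata query. It reports a renamed file as deleted under its old name, and it assumes file names contain no newlines.

```shell
# Hypothetical helper: list regular files that exist in <older-snapshot>
# but have no counterpart of the same name in <newer-snapshot>.
list_deleted() {
    old=$1
    new=$2
    (cd "$old" && find . -type f) | sort | while IFS= read -r path; do
        # Report the path if nothing with that name exists in the newer tree.
        [ -e "$new/$path" ] || [ -L "$new/$path" ] || printf '%s\n' "${path#./}"
    done
}
```

With the example above, list_deleted snap1/ snap2/ should print foo1, the deletion that btrfs-diff missed.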
Solution 3:
This is supported by the snapshot convenience tool snapper.
sudo snapper -c config diff 445..446
Of course, this requires you to be using snapper for your snapshots.
The snapshot IDs can be found using snapper list -a. Unfortunately, at the time of writing, snapper did not support listing snapshots for a single config, though these numbers can be found from the subvolume names.