Sync LVM snapshots to backup server
Solution 1:
Although there are 'write-devices' and 'copy-devices' patches for rsync, they only work well on small images (1-2GB). rsync will spend ages searching for matching blocks on larger images, and it's almost useless for devices or files of 40GB or larger.
We use the following to perform a per-block checksum comparison (the commands below use 1024-byte blocks) and then simply copy a block's content if its checksum doesn't match. We use this to back up servers on a virtual host in the USA to a backup system in the UK, over the public internet. CPU usage is very low, and the snapshot performance hit only occurs after hours:
Create snapshot:
lvcreate -s -i 2 -L 25G /dev/vg_kvm/company-exchange -n company-exchange-snap1
export dev1='/dev/mapper/vg_kvm-company--exchange--snap1';
export dev2='/dev/mapper/vg_kvm-company--exchange';
export remote='[email protected]';
Initial seeding:
dd if=$dev1 bs=100M | gzip -c -9 | ssh -i /root/.ssh/rsync_rsa $remote "gzip -dc | dd of=$dev2"
Incremental nightly backup (only sends changed blocks):
ssh -i /root/.ssh/rsync_rsa $remote "
perl -'MDigest::MD5 md5' -ne 'BEGIN{\$/=\1024};print md5(\$_)' $dev2 | lzop -c" |
lzop -dc | perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\1024};$b=md5($_);
read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 | lzop -c |
ssh -i /root/.ssh/rsync_rsa $remote "lzop -dc |
perl -ne 'BEGIN{\$/=\1} if (\$_ eq\"s\") {\$s++} else {if (\$s) {
seek STDOUT,\$s*1024,1; \$s=0}; read ARGV,\$buf,1024; print \$buf}' 1<> $dev2"
Remove snapshot:
lvremove -f /dev/vg_kvm/company-exchange-snap1
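For readers who find the Perl one-liners hard to follow, here is a minimal Python sketch of the same three stages: digest the destination per block, compare against the source, then patch the destination in place. It is not the author's code. The function names are made up for illustration, and the delta is passed around as a sequence of records rather than as the raw byte stream the Perl version pipes through lzop and ssh.

```python
import hashlib
import io

BLOCK = 1024  # same record size as the Perl one-liners


def block_digests(dest):
    """Stage 1 (backup side): yield one 16-byte MD5 digest per block."""
    while True:
        block = dest.read(BLOCK)
        if not block:
            return
        yield hashlib.md5(block).digest()


def make_delta(src, digests):
    """Stage 2 (source side): b"s" for an unchanged block, b"c" + data otherwise."""
    digests = iter(digests)
    while True:
        block = src.read(BLOCK)
        if not block:
            return
        remote = next(digests, None)  # None if the destination is shorter
        if remote == hashlib.md5(block).digest():
            yield b"s"
        else:
            yield b"c" + block


def apply_delta(dest, delta):
    """Stage 3 (backup side): skip unchanged blocks, overwrite changed ones."""
    for record in delta:
        if record == b"s":
            dest.seek(BLOCK, io.SEEK_CUR)
        else:
            dest.write(record[1:])  # drop the leading b"c" marker
```

In real use, `dest` would be the backup device opened for reading and writing without truncation (for example `open(path, "r+b")`), which is what `1<> $dev2` does in the shell version.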
Solution 2:
Standard rsync is missing this feature, but there is a patch for it in the rsync-patches tarball (copy-devices.diff), which can be downloaded from http://rsync.samba.org/ftp/rsync/. After applying it and recompiling, you can rsync devices with the --copy-devices option.
Solution 3:
People interested in doing this specifically with LVM snapshots might like my lvmsync tool, which reads the list of changed blocks in a snapshot and sends just those changes.
Solution 4:
Take a look at the Zumastor Linux Storage Project. It implements "snapshot" backup using binary "rsync" via the ddsnap tool.
From the man-page:
ddsnap provides block device replication given a block level snapshot facility capable of holding multiple simultaneous snapshots efficiently. ddsnap can generate a list of snapshot chunks that differ between two snapshots, then send that difference over the wire. On a downstream server, write the updated data to a snapshotted block device.
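The idea in that man-page excerpt (build a list of differing chunks, ship it, write it downstream) can be illustrated with a small Python sketch. This is not ddsnap's implementation. The function names and the chunk size are assumptions, and the sketch finds differences by reading and comparing both snapshots, which says nothing about how ddsnap itself derives its chunk list.

```python
import io


def changed_chunks(old_snap, new_snap, chunk_size=4096):
    """Return [(chunk_index, new_data), ...] for every chunk that differs."""
    delta = []
    index = 0
    while True:
        old = old_snap.read(chunk_size)
        new = new_snap.read(chunk_size)
        if not new:
            break
        if old != new:
            delta.append((index, new))
        index += 1
    return delta


def apply_chunks(target, delta, chunk_size=4096):
    """Downstream side: write each changed chunk at its original offset."""
    for index, data in delta:
        target.seek(index * chunk_size)
        target.write(data)


# Tiny in-memory demo: two "snapshots" that differ only in their second chunk.
old = io.BytesIO(b"a" * 8 + b"b" * 8 + b"c" * 8)
new = io.BytesIO(b"a" * 8 + b"Z" * 8 + b"c" * 8)
delta = changed_chunks(old, new, chunk_size=8)

downstream = io.BytesIO(old.getvalue())  # replica still holding the old data
apply_chunks(downstream, delta, chunk_size=8)
```

Only the contents of `delta` would need to cross the wire, so the transfer size follows the amount of change rather than the size of the device.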
Solution 5:
There's a Python script called blocksync, which is a simple way to synchronize two block devices over a network via ssh, transferring only the changes.
- Copy blocksync.py to the home directory on the remote host
- Make sure your remote user can either sudo or is root itself
- Make sure your local user (root?) can read the source device & ssh to the remote host
- Invoke:
python blocksync.py /dev/source user@remotehost /dev/dest
I've recently hacked on it to clean it up and change it to use the same fast-checksum algorithm as rsync (Adler-32).
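As a rough illustration of per-block Adler-32 checksums, here is a short sketch using `zlib.adler32` from the Python standard library. It is not taken from blocksync. The helper name and the block size are assumptions, and whether blocksync computes its checksums this way is not stated above.

```python
import io
import zlib


def block_checksums(stream, block_size=1024 * 1024):
    """Yield the Adler-32 checksum of each block read from stream."""
    while True:
        block = stream.read(block_size)
        if not block:
            return
        # Mask to an unsigned 32-bit value for consistency across versions.
        yield zlib.adler32(block) & 0xFFFFFFFF


# Demo on in-memory data: three identical 9-byte blocks.
sums = list(block_checksums(io.BytesIO(b"Wikipedia" * 3), block_size=9))
```

Adler-32 is cheap to compute but weak, so two different blocks can share a checksum. rsync guards against this by pairing its fast checksum with a stronger hash.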