How can I verify that rsync copied the device correctly when --copy-devices is enabled?
By default rsync compares files by size and timestamp. A device does not have a size, so rsync must instead calculate differences using the delta algorithm, which is described in this tech report. Loosely, the remote file is divided into blocks of a chosen size, and the checksums of these blocks are sent back. The local file is similarly checksummed in blocks and compared with that list. The remote is then told how to reassemble the blocks it already has to remake the file, and the data for any blocks that do not match is sent.
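The loose description above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not rsync's implementation: it compares blocks at fixed block boundaries using md5 alone, whereas real rsync also slides a rolling checksum over every byte offset. The names block_sums, make_delta and apply_delta are made up for this sketch.

```python
import hashlib

def block_sums(data: bytes, blength: int) -> list[bytes]:
    """Checksum each fixed-size block of the existing copy."""
    return [hashlib.md5(data[i:i + blength]).digest()
            for i in range(0, len(data), blength)]

def make_delta(source: bytes, sums: list[bytes], blength: int) -> list[tuple]:
    """Compare the source block by block against the copy's checksums."""
    index = {s: j for j, s in enumerate(sums)}
    delta = []
    for off in range(0, len(source), blength):
        block = source[off:off + blength]
        j = index.get(hashlib.md5(block).digest())
        # Either "reuse block j that you already have" or "here is new data".
        delta.append(("copy", j) if j is not None else ("data", block))
    return delta

def apply_delta(old: bytes, delta: list[tuple], blength: int) -> bytes:
    """Rebuild the file from blocks of the old copy plus the literal data."""
    out = bytearray()
    for kind, value in delta:
        if kind == "copy":
            out += old[value * blength:(value + 1) * blength]
        else:
            out += value
    return bytes(out)

source = bytes(range(256)) * 4      # 1024 bytes standing in for the device
copy = bytearray(source)
copy[500:504] = b"test"             # corrupt one block of the copy
copy = bytes(copy)
blength = 100

delta = make_delta(source, block_sums(copy, blength), blength)
literal = sum(len(v) for k, v in delta if k == "data")
rebuilt = apply_delta(copy, delta, blength)
print(literal, rebuilt == source)
```

Only the one corrupted block is carried as literal data; every other block is reused from the existing copy.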
You can see this by asking for debug output at level 3 for just the deltasum algorithm with the option --debug=deltasum3. You can specify a block size with -B to simplify the numbers. For example, for a file that has already been copied once, a second run of
rsync -B 100000 --copy-devices -avv --debug=deltasum3 --no-W /dev/sdd /tmp/mysdd
produces output like this showing the checksum for each block:
count=164 rem=84000 blength=100000 s2length=2 flength=16384000
chunk[0] offset=0 len=100000 sum1=61f6893e
chunk[1] offset=100000 len=100000 sum1=32f30ba3
chunk[2] offset=200000 len=100000 sum1=45b1f9e5
...
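The numbers in the header line are consistent with each other if we assume that count is the number of blocks and rem is the length of the short final block. A quick check of that reading in Python, using only the values from the output above:

```python
flength = 16_384_000   # file length from the debug header
blength = 100_000      # block length requested with -B

# Number of blocks, rounding up so a short final block is counted.
count = (flength + blength - 1) // blength
# Length of the short final block (0 would mean the file divides evenly).
rem = flength % blength

print(count, rem)
```

This reproduces count=164 and rem=84000 from the header.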
You can then see it matching the checksums of the other device fairly trivially, since there are no differences:
potential match at 0 i=0 sum=61f6893e
match at 0 last_match=0 j=0 len=100000 n=0
potential match at 100000 i=1 sum=32f30ba3
match at 100000 last_match=100000 j=1 len=100000 n=0
potential match at 200000 i=2 sum=45b1f9e5
match at 200000 last_match=200000 j=2 len=100000 n=0
...
At the end the data= field is 0, showing that no new data was sent:
total: matches=164 hash_hits=164 false_alarms=0 data=0
If we now corrupt the copy by overwriting the middle of the file:
echo test | dd conv=block,notrunc seek=80 bs=100000 of=/tmp/mysdd
touch -r /dev/sdd /tmp/mysdd
then the rsync debug shows us a new checksum for block 80 but no match for it. We go from match 79 to match 81:
chunk[80] offset=8000000 len=100000 sum1=a73cccfe
...
potential match at 7900000 i=79 sum=58eabec6
match at 7900000 last_match=7900000 j=79 len=100000 n=0
potential match at 8100000 i=81 sum=eba488ba
match at 8100000 last_match=8000000 j=81 len=100000 n=100000
At the end we have data=100000, showing that a whole new data block had to be sent:
total: matches=163 hash_hits=385 false_alarms=0 data=100000
The number of matches has dropped by 1, because the checksum of the corrupted block failed to match. Perhaps the hash hits rise because we lost sequential matching.
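If you want to check this from a script rather than by eye, the total: line can be parsed. A minimal sketch, assuming the line has the key=value layout shown in the output above (other rsync versions may print it differently); parse_total is a name made up for this example:

```python
import re

def parse_total(line: str) -> dict[str, int]:
    """Turn 'total: matches=163 ... data=100000' into a dict of ints."""
    return {k: int(v) for k, v in re.findall(r"(\w+)=(\d+)", line)}

clean = parse_total("total: matches=164 hash_hits=164 false_alarms=0 data=0")
corrupt = parse_total("total: matches=163 hash_hits=385 false_alarms=0 data=100000")

# data == 0 means every block matched and nothing had to be resent.
print(clean["data"] == 0, corrupt["data"])
```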
If we look further in the same tech report, some test results are shown and the false alarms are described as "the number of times the 32 bit rolling checksum matched but the strong checksum did not". Two checksums are made for each block: a simple checksum and an md5 checksum (md4 in older versions). The simple checksum is easy to search for using a hash table, as it is a 32-bit integer. Once it matches an entry, the longer 16-byte md5 checksum is also compared; if that does not match, it is a false alarm and the search continues.
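The two-level lookup can be illustrated as follows. This is a sketch under assumptions: weak_sum follows the two 16-bit sums described in the tech report, a Python dict stands in for rsync's hash table, and build_table and lookup are names made up for this example. The two 3-byte blocks are chosen by hand so that their simple checksums collide.

```python
import hashlib

def weak_sum(block: bytes) -> int:
    """Simple 32-bit checksum built from two 16-bit sums."""
    a = sum(block) & 0xFFFF
    b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
    return (b << 16) | a

def build_table(blocks):
    """Map each weak checksum to the (block index, md5) pairs that share it."""
    table = {}
    for j, blk in enumerate(blocks):
        table.setdefault(weak_sum(blk), []).append((j, hashlib.md5(blk).digest()))
    return table

def lookup(table, candidate):
    """Return (block index or None, number of false alarms)."""
    false_alarms = 0
    strong = hashlib.md5(candidate).digest()
    for j, known in table.get(weak_sum(candidate), []):
        if strong == known:
            return j, false_alarms
        false_alarms += 1          # weak checksum matched, strong did not
    return None, false_alarms

table = build_table([b"\x01\x00\x01", b"abc"])
hit = lookup(table, b"abc")                   # weak and strong both match
false_alarm = lookup(table, b"\x00\x02\x00")  # weak collides, strong differs
miss = lookup(table, b"xyz")                  # weak checksum not in the table
print(hit, false_alarm, miss)
```

The second lookup finds an entry with the same simple checksum, fails the md5 comparison, and so counts one false alarm without producing a match.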
My example uses a very small (and old) USB key device of 16 MB, and the minimum hash table size is 2**16, i.e. 65536 entries, so it is pretty empty when holding the 164 chunk entries I have. So many false alarms are normal, and are more an indication of efficiency than anything else.