How to back up one big file with small changes?
The rsync program does exactly that. From the man page:
It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
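For a single large file, a minimal sketch might look like this (the paths here are placeholders; note that rsync defaults to --whole-file for local copies, so the delta-transfer algorithm only applies over a network unless you force it):

    # Over a network, only the changed blocks are sent:
    rsync -av --partial /data/bigfile.img backuphost:/backups/

    # For a local copy, force the delta algorithm and update the
    # destination file in place instead of rewriting all of it:
    rsync -av --no-whole-file --inplace /data/bigfile.img /mnt/backup/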
You probably want a modern deduplicating backup program. Check out BorgBackup.
This will keep a backup of every version of your large file but share the common content between the versions, so the total space used for all the backups will be only slightly more than the disk space for a single copy, assuming the versions differ only slightly.
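A minimal sketch of the borg workflow, assuming a repository at a placeholder path (see the BorgBackup docs for encryption and pruning options):

    # One-time repository setup:
    borg init --encryption=repokey /mnt/backup/repo

    # Each run creates a new archive, but chunks that are unchanged
    # since earlier archives are stored only once:
    borg create --stats /mnt/backup/repo::{now} /data/bigfile.img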
If you're IO-limited in any way, use a filesystem such as BTRFS or ZFS that supports incremental backups directly, without having to scan files for differences the way rsync has to.
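A rough sketch with both filesystems (the pool, subvolume, and snapshot names are placeholders):

    # Btrfs: take a read-only snapshot, then send only the blocks
    # that changed since the previous snapshot to the backup disk:
    btrfs subvolume snapshot -r /data /data/.snap-new
    btrfs send -p /data/.snap-old /data/.snap-new | btrfs receive /mnt/backup

    # ZFS equivalent: an incremental send between two snapshots:
    zfs snapshot pool/data@new
    zfs send -i pool/data@old pool/data@new | zfs receive backuppool/data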
Using rsync is going to be slow and very IO-intensive: if whatever application is writing changes to the files is at all IO-limited, rsync will take significant IO cycles away from the very application the files exist to serve. And if your backup process or system is IO-limited, rsync will take IO cycles away from your available backup bandwidth.
Just Google "rsync is slow". For example, see "rsync is very slow (factor 8 to 10) compared to cp on copying files from nfs-share to local dir".