How to make a differential backup in Linux?

Not quite what you asked for, but very similar in effect (i.e., you "pay" storage only for files that actually have changed):

Use rsync to take snapshots, creating hard links for files that have not changed since the previous snapshot.

The big advantage is that each "snapshot" is a full-fledged backup in its own right, i.e. on recovery you only have to restore that one snapshot (instead of recovering a base and its increments).

There is good documentation on that approach available at www.mikerubel.org/computers/rsync_snapshots/


Duplicity backs up directories by producing encrypted tar-format volumes and uploading them to a remote or local file server. Because duplicity uses librsync, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Because duplicity uses GnuPG to encrypt and/or sign these archives, they will be safe from spying and/or modification by the server.

http://duplicity.nongnu.org/

Duplicity implements a traditional backup scheme, where the initial archive contains all information (full backup) and in the future only the changed information is added. However, here are some advantages it may have over other similar solutions:

  • Easy to use: Although duplicity is a command-line utility, the semantics are relatively simple. To take a basic example, this command backs up the /usr directory to the remote host host.net via scp: duplicity /usr scp://host.net/target_dir

  • Encrypted and signed archives: The archives that duplicity produces can be encrypted and signed using GnuPG, the standard for free software cryptology. The remote location will not be able to infer much about the backups other than their size and when they are uploaded. Also, if the archives are modified on the remote side, this will be detected when restoring.

  • Bandwidth and space efficient: Duplicity uses the rsync algorithm so only the changed parts of files are sent to the archive when doing an incremental backup. For instance, if a long log file increases by just a few lines of text, a small diff will be sent to and saved in the archive. Other backup programs may save a complete copy of the file.

  • Standard file format: Although archive data will be encrypted, inside it is in standard GNU-tar format archives. A full backup contains normal tarballs, and incremental backups are tar archives of new files and the deltas from previous backups. The deltas are in the format produced by librsync's command-line utility rdiff. Although you should never have to look at a duplicity archive manually, if the need should arise they can be produced and processed using GnuPG, rdiff, and tar.

  • Choice of remote protocol: Duplicity does not make many demands on its archive server. As long as files can be saved to, read from, listed, and deleted from a location, that location can be used as a duplicity backend. Besides increasing choice for the user, it can make a server more secure, as clients only require minimal access.

Currently local file storage, scp/ssh, ftp, rsync, HSI, WebDAV, Tahoe-LAFS, and Amazon S3 are supported, and others shouldn't be difficult to add.
