I do not like Bacula, Backuppc has some issues, rdiff-backup is too crude, where to turn for rock-solid backups? Rsync-based Dirvish is making me very happy at the moment:
As a layer on top of rdiff-backup, Dirvish's basic operation is the same: the first backup rsyncs all non-excluded files and directories. (Obviously, this may be quite a chunk of time and bandwidth if we are talking about a remote server with a large file system.) Thereafter, each succeeding backup is an rsync of just the changes since the last backup. Adjacent backups appear as directories with the entire contents of the backup within, BUT files that are the same are hard linked, ie. only the delta is taking up actual storage space. This means that pruning old backups simply consists of deleting the directories they are contained in, with hard-linking automatically taking care of the house-keeping for remaining adjacent backups. And it means that maintaining a complete backup of a remote server is reasonable, since after the first backup only incremental changes go over the network on subsequent backups.
What Dirvish adds on top of rdiff-backup is:
On Debian-based systems the framework is already setup with a cronjob that will, by default, fire at 2200 and prune/refresh all backups. Configuration involves creating an /etc/dirvish/master.conf that orchestrates the whole process, and creating a set of directories to house your Dirvish "banks" (groupings of backups) and "vaults" (each "vault" is a directory tree on a specific machine). And then inside each "vault" adding a dirvish/default.conf file that gives specific direction for that particular backup.
For instance, /etc/dirvish/master.conf:
bank: /home/backups/dirvish-local /home/backups/dirvish-officeServer /home/backups/dirvish-ibmProductionServer exclude: lost+found/ core *~ .nfs* tmp /proc /sys /dev /var/cache/apt/archives/*.deb Runall: local-root 22:00 ibmFull 22:00 officeEtc 22:00 officeWWW 22:00 officeMySQL 22:00 expire-default: +15 days expire-rule: # MIN HR DOM MON DOW STRFTIME_FMT * * * * 1 +3 months # * * 1-7 * 1 +1 year * * 1-7 1,4,7,10 1 +1 year * 10-20 * * * +4 days # * * * * 2-7 +30 days
In the above, for instance, officeEtc / officeWWW / officeMySQL are all "vaults" for which I have created subdirectories under /home/backups/dirvish-officeServer. This file defines what is backed up, and a set of defaults for all "vaults". For instance, the above expire rules will keep 15 days of daily backups, three months of weekly backups, and one year of quarterly backups for all vaults, unless there are local vault rules that over-ride this behavior.
An example vault config file, /home/backups/dirvish-ibmProductionServer/ibmFull/dirvish/default.conf:
client: prodServer tree: / xdev: 0 index: gzip log: gzip image-default: %Y%m%d speed-limit: 100 exclude:
"prodServer" must appear in /etc/hosts or resolve via DNS. "tree: /" tells us we are backing up the whole server. "speed-limit: 100" uses the same syntax as rsync, and limits bandwidth to 100 kB/s.
Normally after setting up all the configuration files, one tests the configuration by setting the "tree" parameter to a relatively small subdirectory in each vault in turn, and running:
dirvish --vault --init
for each vault. This causes that tree to be backed up immediately, and you can see and correct any errors. After all is working, reset "tree" to the desired value, and check a couple days later to see if everything is working as expected.