I had to set up proper full backups for a client website, and S3 seemed like an economical way to go (time will tell). Duplicity is a backup tool that uses the rsync algorithm (via librsync) to back up incrementally, which saves on data transfer. It can also encrypt the data, among other goodies.
Installing Duplicity
Since the site was running on CentOS (Debian forever!) and the Yum repositories don’t have Duplicity, I had to add the rpmforge repository:
wget http://packages.sw.be/rpmforge-release/rpmforge-release-0.5.2-2.el5.rf.x86_64.rpm
rpm --import http://apt.sw.be/RPM-GPG-KEY.dag.txt
rpm -K rpmforge-release-0.5.2-2.el5.rf.*.rpm
rpm -i rpmforge-release-0.5.2-2.el5.rf.*.rpm
Now duplicity will install:
yum install duplicity
It also requires the boto library, a Python interface to Amazon Web Services:
yum install boto
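Before going further it can be worth confirming that both pieces actually landed. The following is only a minimal sketch: `have_cmd` is a helper name made up for this example, and it assumes the interpreter is available as `python`.

```shell
# Report whether a command is available on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

for tool in duplicity python; do
    if have_cmd "$tool"; then
        echo "ok: $tool found"
    else
        echo "missing: $tool"
    fi
done

# boto is a Python module rather than a command, so check that it imports.
if have_cmd python && python -c 'import boto' 2>/dev/null; then
    echo "ok: boto importable"
else
    echo "missing: boto"
fi
```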
Make the backup script
Now make a shell script something like this:
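#!/bin/sh
# Encryption passphrase and AWS credentials, read by duplicity from the environment
export PASSPHRASE=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
export AWS_ACCESS_KEY_ID=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX
export AWS_SECRET_ACCESS_KEY=XXXXXXXXXXXXXXXXXXXXXXXXXXXXX

# Directories to back up (paths must not contain spaces)
dirs="/var/svn \
/var/www \
/home \
/usr/local/stuff"

for d in $dirs; do
    # Each directory gets its own prefix in the bucket, e.g. /var/www -> www
    prefix=$(basename "$d")
    echo "duplicity $d s3+http://backup.bucket.name/$prefix"
    duplicity --full-if-older-than 30D "$d" "s3+http://backup.bucket.name/$prefix"
    echo ""
done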
PASSPHRASE can be anything you want, just make it long so the encryption is strong.
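If you would rather not invent a passphrase by hand, one option is to pull it from the kernel's random source. This is just a sketch; `gen_passphrase` is a name made up for the example, and it relies only on `head`, `od` and `tr`.

```shell
# Generate a random passphrase: 32 random bytes printed as 64 hex characters.
gen_passphrase() {
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
}

PASSPHRASE=$(gen_passphrase)
export PASSPHRASE
echo "passphrase length: ${#PASSPHRASE}"
```

Generate it once and paste the value into the backup script rather than regenerating it on every run, and keep a copy somewhere safe off the server: without the passphrase the encrypted backups cannot be restored.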
The keys come from AWS – see https://portal.aws.amazon.com/gp/aws/securityCredentials
“--full-if-older-than 30D” makes duplicity perform a full backup whenever the most recent full backup is more than 30 days old. Otherwise it does an incremental backup.
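To see whether the full/incremental rhythm is working as expected, duplicity's `collection-status` action lists the backup chains stored at a target. A rough sketch follows, assuming the same bucket and prefixes as the backup script; `show_status` and the `RUN` switch are names invented for this example, and the AWS keys need to be exported as in the backup script before running it for real.

```shell
# Print the status command for one backup prefix; set RUN=1 to execute it.
BUCKET_URL="s3+http://backup.bucket.name"

show_status() {
    target="$BUCKET_URL/$1"
    if [ "${RUN:-0}" = "1" ]; then
        duplicity collection-status "$target"
    else
        echo "duplicity collection-status $target"
    fi
}

# Prefixes are the basenames of the directories in the backup script.
for prefix in svn www home stuff; do
    show_status "$prefix"
done
```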
You can also make a restore script by swapping the duplicity arguments, so the bucket URL comes first and the local directory second:
duplicity s3+http://backup.bucket.name/$prefix $d
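Spelled out as a loop, a restore script might look roughly like this. It is a sketch under a few assumptions: it restores into a scratch directory so live data stays untouched, `RESTORE_ROOT` and `DRY_RUN` are illustrative names rather than duplicity options, and PASSPHRASE plus the AWS keys are exported exactly as in the backup script.

```shell
# Restore each backup into a scratch directory instead of over live data.
BUCKET_URL="s3+http://backup.bucket.name"
RESTORE_ROOT="${RESTORE_ROOT:-/tmp/restore}"
DRY_RUN="${DRY_RUN:-1}"

restore_dir() {
    d=$1
    prefix=$(basename "$d")
    dest="$RESTORE_ROOT/$prefix"
    if [ "$DRY_RUN" = "1" ]; then
        # Only show what would be run.
        echo "duplicity $BUCKET_URL/$prefix $dest"
    else
        mkdir -p "$RESTORE_ROOT"
        duplicity "$BUCKET_URL/$prefix" "$dest"
    fi
}

for d in /var/svn /var/www /home /usr/local/stuff; do
    restore_dir "$d"
done
```

With the default `DRY_RUN=1` the script only prints the commands; run it with `DRY_RUN=0` to perform the restore, then copy files back into place by hand once they look right.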
Add the script to cron
0 0 * * * /root/scripts/etc/backup.sh >>/var/log/duplicity/etc.log
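Two details are easy to miss with that entry: the log directory has to exist, and `>>` on its own only captures standard output, so error messages never reach the log. A small sketch that takes care of both is below; `LOG_DIR` and `cron_line` are names chosen for this example.

```shell
# Create the log directory and print a crontab line that also logs errors.
# LOG_DIR defaults to the path used in the cron entry above.
LOG_DIR="${LOG_DIR:-/var/log/duplicity}"

cron_line() {
    echo "0 0 * * * /root/scripts/etc/backup.sh >>$LOG_DIR/etc.log 2>&1"
}

mkdir -p "$LOG_DIR" 2>/dev/null || echo "could not create $LOG_DIR (need root?)" >&2
cron_line
```

The printed line can be pasted into root's crontab with `crontab -e`.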
I’m setting up a backup solution like this one with DreamHost/DreamObjects, which also uses the S3 protocol. This guide saved my day, since I couldn’t find the needed AWS_* variables in the duplicity documentation. Thanks!