HowTo: Backup your sites with Rsync

kblessinggr
Sep 15, 2008
First - Get some sort of off-server location where you can back up your stuff on a regular basis. I recommend ServerSync: they provide RAID-6 (protection against two simultaneous drive failures) with no transfer limit, and they're pretty cheap, $5 a month for 20GB of storage or $10 for 50GB (I'm on the 50GB plan). You can reach it via FTP, SSH, SCP, Rsync, etc.

Second - Prepare your remote location to receive the backup. For example, on mine I simply created a folder called backup, followed by the server name, resulting in /home/kbeezie/backup/kbeezie.com/.
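
If your account gives you SSH access, you can create that folder from your own machine in one go (a minimal sketch; the username, host, and server name are placeholders to swap for your own):

Code:
# create the backup target directory on the remote host (adjust names/paths to yours)
ssh username@serversync-ip "mkdir -p /home/username/backup/server-name"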

Third - Create a bash script to back up your databases, then run the rsync process. Make a file, say backup.sh.
(repeat the mysqldump line for as many databases as you have)

Code:
#!/bin/sh

# today's date, used to stamp the dump files (e.g. 09-15-08)
date=`date '+%m-%d-%y'`
# dump each database to its own dated file (the ~/mysql_backup folder needs to exist)
mysqldump -u database_username -pdatabase_password database_name > ~/mysql_backup/database_name.$date
# push the whole home directory to the remote backup location over ssh
rsync -e ssh -az /home/username/ username@serversync-ip:/home/username/backup/server-name/
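
One thing worth noting: for rsync over ssh to run unattended from cron, you'll want key-based authentication set up, otherwise it will just sit there waiting for a password. A minimal sketch, assuming your ServerSync account accepts SSH keys (username and host are placeholders):

Code:
# generate a key pair if you don't already have one (accept the defaults, empty passphrase)
ssh-keygen -t rsa
# copy the public key to the backup host so future logins don't prompt for a password
ssh-copy-id username@serversync-ip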

We'll need to make sure the script can be executed. So on the server, after uploading it (preferably to your home folder, but not to a publicly accessible folder such as where your website files live, since the shell script contains database passwords), run:

Code:
chmod +x backup.sh
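
If you wrote the script on your own machine, a quick way to get it onto the server and lock it down is something like this (just a sketch; the hostname and paths are placeholders):

Code:
# from your local machine: copy the script to your home folder on the web server
scp backup.sh username@your-server:~/
# then on the server: make it executable and readable only by you (it contains database passwords)
chmod 700 ~/backup.sh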

We'll then want to set up a cron job to perform this backup nightly, let's say 3AM server time.

Code:
crontab -e

Then type in the following (if it pulls up vi, press 'i' to insert text, press ESC to leave insert mode, then type shift+zz to save and exit):

Code:
0 3 * * * /path/to/backup.sh > /dev/null 2>&1

The last bit (> /dev/null 2>&1) silences the output by sending both stdout and stderr to /dev/null.
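
If you'd rather keep a record of each run instead of throwing the output away, you could point it at a log file instead (the log path here is just an example):

Code:
0 3 * * * /path/to/backup.sh >> /home/username/backup.log 2>&1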

If you wish to manually execute the backup script you can do so by directly calling it.

Code:
/path/to/backup.sh

If you wish to just run the rsync command and see progress as it runs, do the following.

Code:
rsync -e ssh -avvz --progress /home/username/ username@serversync-ip:/home/username/backup/server-name/
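
You can also do a dry run first to preview what would be transferred without actually copying anything (same command, just with -n added):

Code:
rsync -e ssh -avvzn --progress /home/username/ username@serversync-ip:/home/username/backup/server-name/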

Example:

On my personal VPS I run an nginx webserver with no control panel; all my websites are stored under /opt/html and my database backups get stored under /opt/mysql_backup, so the above script would look something like this.

Code:
#!/bin/sh

# today's date, used to stamp the dump files
date=`date '+%m-%d-%y'`
mysqldump -u database_username -pdatabase_password database_name > /opt/mysql_backup/database_name.$date
# repeat the mysqldump line above for each additional database
# sync both the website files and the database dumps to the remote backup location
rsync -e ssh -az /opt/html/ username@serversync-ip:/home/username/backup/server-name/
rsync -e ssh -az /opt/mysql_backup/ username@serversync-ip:/home/username/backup/server-name/mysql_backup/

The first time it runs it'll transfer every file I have under /opt/html. Every time after that, it'll only transfer files that have changed (the database backups will always transfer because they're new files with new dates; remove the date from the file name if you want each run to overwrite the latest mysql backup in the same folder).
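
For example, that overwrite-in-place variant is just the same dump line without the date stamp:

Code:
mysqldump -u database_username -pdatabase_password database_name > /opt/mysql_backup/database_name.sql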

If by some chance your mysql dump files are huge, you can add this just above the rsync line of your bash script (which I may not have exactly right, so any like-minded folks correct me if I'm wrong):

Code:
# bundle all of today's dumps into one gzipped tarball
tar zcf /path/to/mysql_backup/mysql_$date.tar.gz /path/to/mysql_backup/*.$date
# then remove the individual dump files, leaving only the tarball to sync
rm -Rf /path/to/mysql_backup/*.$date

The above will compress all the mysql backups dumped on that date, then remove the original dumps, leaving the gzipped tar file ready to be synced.
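
Those dated tarballs will pile up over time, so you may also want to prune the old ones. A simple way (a sketch, assuming GNU find and that you're happy keeping roughly two weeks' worth) is something like:

Code:
# delete database tarballs older than 14 days
find /path/to/mysql_backup -name 'mysql_*.tar.gz' -mtime +14 -delete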

PS: Anyone signing up for IonVz can have this done for them as a signup bonus (or as a regular part of managed support, so you have your own backup on top of the backup we already do).

The Serversync people are also very helpful if you have any questions about how to get something backed up, or the various ways to connect to your serversync account, including how to mount your remote storage as a drive on your system.
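
For instance, if the account supports SFTP, one common way to mount it as a local directory is sshfs (just a sketch; check with them for the method they actually recommend, and the mount point here is arbitrary):

Code:
# mount the remote backup space under /mnt/serversync (requires the sshfs package)
mkdir -p /mnt/serversync
sshfs username@serversync-ip:/home/username /mnt/serversync
# when finished, unmount it
fusermount -u /mnt/serversync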
 


PS: Yes, you could use the --all-databases option to mysqldump to simply dump every database you have into a single file using your mysql root password, but when it comes to restoring a specific database it makes it somewhat difficult to do so without also overwriting other databases that may not need restoring.
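
If you have a lot of databases and don't want to maintain one mysqldump line per database, a middle ground is to loop over the database list and still write one dated file each (a sketch; the credentials and paths are placeholders, and you'd probably want to skip MySQL's internal schemas):

Code:
#!/bin/sh
date=`date '+%m-%d-%y'`
# dump each database to its own dated file, skipping MySQL's internal schemas
for db in `mysql -u database_username -pdatabase_password -N -e 'show databases'`
do
    case $db in
        information_schema|performance_schema|mysql) continue ;;
    esac
    mysqldump -u database_username -pdatabase_password $db > ~/mysql_backup/$db.$date
done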
 
If you really want to do it right, then also add a step to keep old data in reverse increments.

Drive failure is not the only thing you need to worry about. Sometimes, you just fuck up and delete your own shit by mistake. And sometimes, you only notice you've deleted shit that was important after a while. Alternatively, shit gets corrupted, and it might take you a couple of days to figure out that something isn't right.

So keeping just the latest backup isn't the best way to do it.

All POSIX systems support hard-linking. You can take advantage of that by creating snapshots that hard-link unchanged files and only store new copies of the changed ones. That way, if you want to keep copies of your daily backups for the last, say, 10 days, you don't actually need 10x the space. You only need enough space to store the most recent backup plus the files that were changed.

I'm not going to bore you with the details. But here is a script that, if executed prior to rsync-ing on the backup host, will preserve your previous day's data.

Code:
#!/bin/sh

# move into the backup directory; bail out if it's missing so we don't rm in the wrong place
cd /my_backup_dir || exit 1

# rotate the daily snapshots: drop the oldest, shift the rest back one day
rm -R -f 005
mv 004 005
mv 003 004
mv 002 003
mv 001 002
# hard-link-copy the latest snapshot (000) to 001; rsync will then update 000
cp -a -l 000 001

# on the first of the month, refresh the monthly snapshot as well
if [ `date +%F | sed 's/^.*-//'` = "01" ]
then
    rm -R -f monthly
    cp -a -l 000 monthly
fi
And then, you would rsync your data into /my_backup_dir/000.
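
So the rsync line from the earlier script would just point at the 000 directory instead (the paths, username, and host are the placeholders from the posts above):

Code:
rsync -e ssh -az /home/username/ user@box:/my_backup_dir/000/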

That way, if you need to get files from 2 days ago, you can look into 002, and monthly will get you the data from the end of the previous month. Each directory contains the entire backup as it was on the day you made it, so there's no need to piece shit together like with standard incremental backups.

I usually use 5 days, but you can go as far back as you want.

"cp -a -l" does a copy by hard links. In other words, it doesn't copy the files, but creates hard links. (If you don't know what a hard link is, then don't worry about it. Think of it as just two pointers to the same file, pointing from different directories.)

Rsync, in turn, doesn't update files in place. If some file has changed, it uploads the changed parts, builds a new copy of the entire file, removes the old one, and renames the new one to take its place.

By doing so, it effectively removes the link from 000, while the link in 001 still exists and the old file is not modified. So for any file that was changed the previous day, the old copy is still in 001, while 000 contains the latest data.

To completely automate it, include the call to the remote copy/rotate script from within your backup script. You can do that through ssh.

Like,

Code:
ssh user@box "/home/backup/the_rotate_script"
...
rsync command
That will call the rotation script on the backup host, and once it finishes, rsync the latest data onto it.
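
Putting the pieces from both posts together, a full nightly backup.sh might look roughly like this (a sketch only; every path, hostname, and credential here is a placeholder to adapt):

Code:
#!/bin/sh

date=`date '+%m-%d-%y'`

# dump the databases locally (repeat per database)
mysqldump -u database_username -pdatabase_password database_name > ~/mysql_backup/database_name.$date

# rotate the snapshots on the backup host, then push the latest data into 000
ssh user@box "/home/backup/the_rotate_script"
rsync -e ssh -az /home/username/ user@box:/my_backup_dir/000/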
 
Thanks for that bcc423, the idea of making increments of past days did occur to me, but I was afraid most of the people who would have to follow the instructions might get confused to hell with that extra step, but that script should help. (Of course, would you have even posted that had I not started it off? :D )

And a correction: if you really want to do it right, you hire someone to do that kind of stuff instead of attempting it on your own. *grin*
 
Thanks.. I've been avoiding this for a while; this little script will sure make it easier.