Codebox Software

Linux/UNIX Backup Script


This shell script backs up important files and stores them remotely in an encrypted archive. Many ISPs give their customers some free web space on a server for personal web pages. If you're not using yours, why not turn it into an offsite backup location?

#!/bin/sh 
### Change all these
BACKUP_LIST=~rob/backup.list
EXCLUDE_FILE=~rob/backup.exclude
OUTPUT_DIR=~rob
OUTPUT_FILE=backup.tar
CRYPT_KEY="y0uR cRypT!KeY in H3re"
FTP_USER=rob@ftpbox
FTP_PASS=secret_password
FTP_SERVER=files.myisp.com
FTP_DIR=backupdir

### Don't change these
FTP_OK_MSG="^226 "
FTP_LOG=$0.ftp.log
OUTPUT_ZIPFILE=$OUTPUT_FILE.gz
OUTPUT_ENCRYPTED=$OUTPUT_ZIPFILE.gpg

doBackup(){
    SOURCE=$1
    # Quote the path so that entries containing spaces are handled correctly
    if [ ! -e "$SOURCE" ]; then
        echo "$0 WARNING file $SOURCE could not be found" 1>&2
    else
        echo "backing up $SOURCE to $OUTPUT_FILE"
        tar -rPf "$OUTPUT_FILE" --exclude-from="$EXCLUDE_FILE" "$SOURCE"
    fi
}

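reportFileSize(){
    FILE=$1
    MSG=$2
    # wc -c counts the bytes directly; parsing ls -l output breaks
    # on systems where ls pads its columns with extra spaces
    echo $MSG $FILE is `wc -c < "$FILE"` bytes
}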

###################################
# Prepare everything...
###################################
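# Stop here if the output directory is unusable, so that the rm
# commands below never run in the wrong directory
cd "$OUTPUT_DIR" || exit 1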

rm -f $OUTPUT_FILE
rm -f $OUTPUT_ZIPFILE
rm -f $OUTPUT_ENCRYPTED
rm -f $FTP_LOG

if [ ! -e $BACKUP_LIST ]; then
    echo "$0 could not find the backup list $BACKUP_LIST" 1>&2
    exit 1
fi

if [ ! -e $EXCLUDE_FILE ]; then
    # We need the file to exist otherwise the tar command fails
    touch $EXCLUDE_FILE
fi

###################################
# Backup the files into an archive and compress it
###################################
echo Running backup with the following excludes...
cat $EXCLUDE_FILE

# Create the archive and put a copy of the backup list into it
tar -cPf $OUTPUT_FILE $BACKUP_LIST

# Read the entries from the BACKUP_LIST file, and add each one into the archive
while read -r ENTRY
do
    # Skip blank lines in the list
    [ -n "$ENTRY" ] || continue
    doBackup "$ENTRY"
done < "$BACKUP_LIST"

reportFileSize $OUTPUT_FILE "Before compression"

# Compress the archive
gzip $OUTPUT_FILE 

reportFileSize $OUTPUT_ZIPFILE "After compression"

###################################
# Encrypt backup file
###################################
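# GnuPG 2.x only honours --passphrase when --batch is also given
# (2.1 and later additionally need --pinentry-mode loopback)
gpg -c --batch --passphrase "$CRYPT_KEY" $OUTPUT_ZIPFILE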

reportFileSize $OUTPUT_ENCRYPTED "After encryption"

###################################
# FTP backup file
###################################
ftp -nv $FTP_SERVER > $FTP_LOG << EOF
    user $FTP_USER $FTP_PASS
    cd $FTP_DIR
    put $OUTPUT_ENCRYPTED
    bye
EOF

OK_MSG_COUNT=`grep -c "$FTP_OK_MSG" $FTP_LOG`
if [ $OK_MSG_COUNT = 1 ]; then
    echo FTP transfer completed ok
    EXIT_CODE=0
else
    echo FTP transfer failed! 1>&2
    cat $FTP_LOG 1>&2
    EXIT_CODE=1
fi

###################################
# Clean up and exit (leave the zipped backup file in place)
###################################
rm -f $OUTPUT_FILE
rm -f $OUTPUT_ENCRYPTED
rm -f $FTP_LOG

exit $EXIT_CODE

Notes

To use the script you will need to change the 9 values indicated, as follows:

  • BACKUP_LIST - the name and location of a file containing the list of things that you want to back up. The list can include individual files or whole directories, with each item on a line by itself:
    /etc
    /var/www
    /home/rob/plans/world_domination.txt
    /home/rob/docs
  • EXCLUDE_FILE - the name and location of another file, in the same format as BACKUP_LIST, listing things covered by the backup list that you don't actually want to back up. This sounds odd at first, but it makes sense. Say, for example, that you want to back up a directory called /var/www containing all your websites, but not the logs stored in /var/www/logs. With an entry for /var/www in your BACKUP_LIST file and an entry for /var/www/logs in your EXCLUDE_FILE, everything in that directory is backed up except the logs. This is simpler than individually naming each of the sub-directories of /var/www that you want to include, and it is also more future-proof: if you add a new website next month, it will be backed up without you having to remember to add it to your list.
  • OUTPUT_DIR - this is the directory on your machine that the script will use for creating various temporary files, and where it will leave an unencrypted copy of the archive that gets sent to the FTP server. You just need to make sure that the script is able to write to whatever location you specify here, and that there is sufficient space available to create all the files.
  • OUTPUT_FILE - the name of the file that will contain your backup. It is recommended that you end the file name with '.tar'. Note that once the backup has been compressed and encrypted, the name will have .gz.gpg appended to it, so if you specify an OUTPUT_FILE of backup.tar, the file that gets copied to the FTP server will actually be called backup.tar.gz.gpg.
  • CRYPT_KEY - the key that will be used to encrypt the backup file before it gets sent over FTP. You will need this key to decrypt the file, so make it something you can remember, and bear in mind all the usual advice about not using dictionary words, mixing the case of the letters etc.
  • FTP_USER - the username for your account on the FTP server
  • FTP_PASS - the password for your account on the FTP server
  • FTP_SERVER - the IP address or hostname of the FTP server
  • FTP_DIR - the directory on the FTP server where you want the backup file to be stored. If you don't want to change directory before storing the file, just set this to a full stop (a period).
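
To see how the BACKUP_LIST and EXCLUDE_FILE files interact, here is a minimal sketch that builds a throwaway directory tree and archives it in the same way as the script. The directory names and the demo.list, demo.exclude and demo.tar file names are made up for illustration, and the demo uses relative paths rather than the absolute paths (and tar -P option) used by the real script.

```shell
#!/bin/sh
# Demo only: every name below is hypothetical and lives in a temporary directory
DEMO_DIR=$(mktemp -d)
cd "$DEMO_DIR" || exit 1

# A pretend web root with one site and a logs directory
mkdir -p www/site1 www/logs
echo "<html></html>" > www/site1/index.html
echo "GET /" > www/logs/access.log

echo "www" > demo.list          # what to back up
echo "www/logs" > demo.exclude  # what to leave out

# Same approach as the script: create the archive, then append each entry
tar -cf demo.tar demo.list
while read -r ENTRY
do
    tar -rf demo.tar --exclude-from=demo.exclude "$ENTRY"
done < demo.list

# List what ended up in the archive; nothing under www/logs should appear
DEMO_CONTENTS=$(tar -tf demo.tar)
echo "$DEMO_CONTENTS"
```

Running a small test like this before the first real backup is a cheap way to confirm that your exclude entries match what you expect.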

Depending on your system, you may also need to install the gpg utility to perform the encryption.
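
A quick way to find out whether anything is missing is to check for each command before running the backup. The checkTools function below is a hypothetical helper, not part of the script; it is only a sketch that relies on the standard command -v lookup.

```shell
#!/bin/sh
# Hypothetical helper: report any of the named commands that are not on the PATH.
# Returns non-zero if at least one is missing.
checkTools(){
    MISSING=""
    for TOOL in "$@"
    do
        if ! command -v "$TOOL" > /dev/null 2>&1; then
            MISSING="$MISSING $TOOL"
        fi
    done
    if [ -n "$MISSING" ]; then
        echo "Missing commands:$MISSING" 1>&2
        return 1
    fi
    return 0
}

# The external commands that the backup script calls
checkTools tar gzip gpg ftp || echo "Install the missing commands before running the backup" 1>&2
```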

To decrypt the backup file, just use the gpg utility against the encrypted archive like this (entering your key when prompted to do so):

gpg backup.tar.gz.gpg
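
Decrypting gives you back the compressed archive (backup.tar.gz), which still has to be unpacked. The following is a sketch of one way to do the whole restore, assuming the default file names from the script. RESTORE_DIR and the restoreBackup function are hypothetical names introduced for this example. Because the archive was created with tar -P it holds absolute paths, so the sketch unpacks into a scratch directory rather than over the live files.

```shell
#!/bin/sh
# Restore sketch: RESTORE_DIR is a hypothetical scratch location
ENCRYPTED=backup.tar.gz.gpg
RESTORE_DIR=./restore

restoreBackup(){
    # $1 = gzip-compressed tar archive, $2 = directory to unpack into
    mkdir -p "$2" || return 1
    # Without -P, tar strips the leading / from each member name, so the
    # files are unpacked beneath $2 instead of over the originals
    tar -xzf "$1" -C "$2"
}

if [ -e "$ENCRYPTED" ]; then
    # Prompts for the key and writes the decrypted archive to backup.tar.gz
    gpg -o backup.tar.gz -d "$ENCRYPTED" || exit 1
    # List the contents first, then unpack them
    tar -tzf backup.tar.gz
    restoreBackup backup.tar.gz "$RESTORE_DIR"
else
    echo "$ENCRYPTED not found in the current directory" 1>&2
fi
```

Once you have checked the restored files, copy the ones you need back into place by hand.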

This script is NOT very secure: it contains both the crypto-key for your backup and the password for your FTP account in plaintext. At a minimum, you should change the permissions on the script file so that only you have read and execute access to it. Also bear in mind that because the crypto-key is passed to gpg as a command-line parameter, the key will be visible in the process list of your system (accessible via ps -ef) while the encryption command is running.
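
One way to reduce both risks is to restrict the file permissions and to have gpg read the key from a file instead of the command line. The sketch below is an assumption-laden alternative, not what the script above does: KEY_FILE, protectFile and encryptWithKeyFile are hypothetical names, and the --pinentry-mode loopback option is only understood by GnuPG 2.1 and later, so drop it on older versions.

```shell
#!/bin/sh
# Sketch: read the key from a file so it never appears in the process list.
# KEY_FILE is a hypothetical location holding the key on its first line.
KEY_FILE=${KEY_FILE:-$HOME/.backup.key}

protectFile(){
    # $1 = file that only its owner should be able to read or write
    chmod 600 "$1"
}

encryptWithKeyFile(){
    # $1 = file to encrypt, $2 = file containing the key
    if [ ! -r "$2" ]; then
        echo "cannot read key file $2" 1>&2
        return 1
    fi
    # --batch stops gpg from prompting; --passphrase-file reads the first line of $2
    gpg --batch --pinentry-mode loopback --passphrase-file "$2" -c "$1"
}

# Example usage (assumes backup.tar.gz exists and KEY_FILE holds your key):
#   chmod 700 backup.sh        # script readable and executable by you only
#   protectFile "$KEY_FILE"
#   encryptWithKeyFile backup.tar.gz "$KEY_FILE"
```

The key is still stored in plaintext on disk, so this only helps if the key file's permissions are kept tight.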