
Backup using rsync


I have two identical drives. Let's call them "S:" (source) and "D:" (destination).

S: is the drive where I keep all my files (images, music, videos, documents, etc.), and D: is a backup drive that I manually back up to every Sunday night.

What I would like to do is back up S: to D:, with a few rules.

Like I said, I do backups once a week. This means that throughout the week, files get added, deleted, and moved from one folder to another.

  1. Only copy new files, or files that have been modified (this would need to check each file's metadata).

  2. At the end of the backup, D: has to end up identical to S:.

Meaning, if I moved a file from folder "A" to folder "B" in S:, the backup would see that the file is no longer in folder "A" and would delete it from D:, so that the folder matches the one in S:.

Step 2 was probably poorly explained, so here's a better explanation. This is how I plan on doing things if rsync can't do it.

In Python, I would create a script that does the following (in order):

  1. Compare D: to S:. The script first traverses D:. Each time it enters a directory, it looks at the same directory in S: and compares the files. If a file is in D: but not in S:, the file has been deleted, renamed, or moved in S:, so the script deletes it from D: (and repeats this for all folders).

  2. Now that D: has no files that are missing from S:, start copying. For each file in S:, check whether it exists in D:. If not, copy it. If it does, check its metadata, and if the file has been modified, copy it and overwrite the old version.
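For illustration, the two passes described above could be sketched in Python roughly as follows. This is a minimal sketch, not a tested backup tool: the function names `mirror` and `is_modified` are made up here, "modified" is assumed to mean that the size or modification time differs, and symbolic links, permissions errors, and special files are not handled.

```python
import os
import shutil


def is_modified(src_file, dst_file):
    """Assumption: a file counts as modified if size or mtime (in whole seconds) differs."""
    s, d = os.stat(src_file), os.stat(dst_file)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def mirror(src, dst):
    """Make dst match src: prune extras first, then copy new or changed files."""
    # Pass 1: walk the backup bottom-up and delete anything missing from the source.
    for root, dirs, files in os.walk(dst, topdown=False):
        rel = os.path.relpath(root, dst)
        for name in files:
            if not os.path.isfile(os.path.join(src, rel, name)):
                os.remove(os.path.join(root, name))
        for name in dirs:
            if not os.path.isdir(os.path.join(src, rel, name)):
                shutil.rmtree(os.path.join(root, name))

    # Pass 2: walk the source and copy files that are new or modified.
    for root, dirs, files in os.walk(src):
        rel = os.path.relpath(root, src)
        target_dir = os.path.normpath(os.path.join(dst, rel))
        os.makedirs(target_dir, exist_ok=True)
        for name in files:
            s = os.path.join(root, name)
            d = os.path.join(target_dir, name)
            if not os.path.exists(d) or is_modified(s, d):
                shutil.copy2(s, d)  # copy2 also copies the modification time
```

Calling `mirror("/mnt/S", "/mnt/D")` (hypothetical mount points) would run both passes. Because `shutil.copy2` carries the modification time over to the copy, files that have not changed since the last run should be skipped the next time.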


Solution

  • Here's a script I wrote to back up my Linux machine to a USB drive.

    #!/bin/sh
    
    rsync -a \
      --progress \
      --hard-links \
      --whole-file \
      --delete \
      --delete-after \
      --delete-excluded \
      --stats \
      --filter='- *.log' \
      --filter='- /dev' \
      --filter='- /boot' \
      --filter='- /media/' \
      --filter='- /mnt' \
      --filter='- /net' \
      --filter='- /proc' \
      --filter='- /tmp/' \
      --filter='- /var/log/' \
      / /media/disk/middle-earth
    

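    The --filter lines exclude files and subdirectories that I don't want to sync. The -a option preserves timestamps and other metadata, so files that haven't changed are skipped on later runs, and the --delete options remove anything from the destination that no longer exists in the source.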

    You can use this as a starting point to craft your own.
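Since the question mentions Python, here is one way the two-drive case might be driven from a small wrapper. It is only a sketch under stated assumptions: `build_rsync_command` and `run_backup` are hypothetical helper names, `/mnt/S` and `/mnt/D` stand in for wherever the drives are actually mounted, and the flag set is a pared-down variant of the script above rather than the script itself. The `-a`, `--delete`, `--itemize-changes`, and `--dry-run` options are standard rsync options.

```python
import subprocess


def build_rsync_command(src, dst, dry_run=True):
    """Return an rsync argument list that mirrors the contents of src into dst."""
    cmd = ["rsync", "-a", "--delete", "--itemize-changes"]
    if dry_run:
        cmd.append("--dry-run")  # report what would change without touching dst
    # A trailing slash on the source means "the contents of src", not src itself.
    cmd += [src.rstrip("/") + "/", dst]
    return cmd


def run_backup(src, dst, dry_run=True):
    """Run rsync with the command built above; raises if rsync exits with an error."""
    subprocess.run(build_rsync_command(src, dst, dry_run), check=True)
```

For example, `run_backup("/mnt/S", "/mnt/D")` would only list the planned changes, which makes it easy to review the deletions first. Passing `dry_run=False` performs the real sync.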