linux backup rsync

Why did Rsync overwrite my operating system?


I've been running Ubuntu Server on a little Intel NUC for the past few years. It has two external hard drives attached via USB: Media and MediaBackup.

The external hard drives had long ago been added to fstab to auto-mount at boot:

echo "/dev/sdb1 /media/Media ext4 user 0 0" | sudo tee -a /etc/fstab
echo "/dev/sdc1 /media/MediaBackup ext4 user 0 0" | sudo tee -a /etc/fstab

At some point, I wanted to automate a nightly backup of my Media directory, so I passed a command along to cron:

echo "00 1 * * * root rsync -av --delete /media/Media/ /media/MediaBackup/" | sudo tee -a /etc/cron.d/backup-media

This ran reliably and without issue for a little more than a year.

One morning, I found that I couldn't SSH into the NUC, or do anything with it. After booting into the NUC with a live USB, I realized that rsync had completely overwritten my OS at /dev/sda1.

Why did rsync overwrite my OS, and how can I prevent it from doing so again?

Edit

As @thatotherguy pointed out in the comments, the scripting in my original question was malformed in several ways. It was based off a setup.sh script I've kept on GitHub for a year or so. The script reflected my best guess at the time on how to re-create the server's state on a fresh installation, but I never actually used it.

I've corrected the parts that would make the script fail, but left in what seem like the two critical mistakes:

  1. Executing the rsync backup as root, as everyone pointed out, and
  2. As @Slawomir Dziuba pointed out, mounting the hard drives in fstab by their SCSI disk assignments (/dev/sda, /dev/sdb), instead of their UUIDs

After a good bit of reading up on fstab, it looks like @Slawomir Dziuba's response about using UUIDs, and everyone's response about not executing as root, are probably good answers to "how can I prevent it from doing so again?"

I'll keep this question up for a few days in case any of the commenters wants to answer it. Not sure if it's a good candidate to remain.

Thanks for your help, everyone.


Solution

  • There are a few things that are worth paying attention to in the described incident.

    1. Avoid working as root unless you absolutely need to. One mistake as root can destroy anything on the system, and the same goes for scripts that run with root privileges. Had the cron job not run as root, rsync would have had no permission to overwrite system files, and despite the disk mix-up nothing would have happened.

    You can use the command crontab -e, which opens your user's crontab in your favorite editor and lets you schedule any script and its run time. This also gives you more control over the system, because you can keep your cron scripts in one directory, e.g. ~/crons. If you are consistent about this for all services, it is enough to back up the system settings once and then back up only your home directory.

    An important difference is that root and a regular user have different environment variables available. It is good practice to set everything you need explicitly at the beginning of the script rather than rely on the environment. After writing the script, check that it works flawlessly not only from your own shell but also when run from cron.
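As a minimal sketch of the advice above, a backup script run from a regular user's crontab might look like the following. The script path, the log file name, and the `run_backup` function are assumptions made for illustration; the source and destination are the ones from the question. The refusal to run as root is an extra safeguard added here, not something the original setup had.

```shell
#!/bin/sh
# Hypothetical ~/crons/backup-media.sh, scheduled with `crontab -e` using a line such as:
#   00 1 * * * "$HOME/crons/backup-media.sh" >>"$HOME/crons/backup-media.log" 2>&1
# (the script path and log file name are assumptions for illustration)

# Set everything the script needs explicitly; cron provides only a minimal environment.
PATH=/usr/local/bin:/usr/bin:/bin
export PATH
SRC=/media/Media/
DST=/media/MediaBackup/

run_backup() {
    # Refuse to run as root, so a wrongly mounted disk cannot lead to overwritten system files.
    if [ "$(id -u)" -eq 0 ]; then
        echo "refusing to run as root" >&2
        return 1
    fi
    rsync -av --delete "$SRC" "$DST"
}

# Uncomment when installing this as the real cron script:
# run_backup
```

The regular user needs write access to the destination directory for this to work, which is worth checking once by hand before relying on cron.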

    The same caution applies to using redirects, echo, etc. to edit system files. Bash is full of exceptions and unusual substitutions, and if you are not absolutely sure what you are doing you can hurt yourself badly. For this reason, before issuing any complex command, check what it really does (put echo in front of the command), and after running it, check that the result is what you planned. It also makes sense to check the exit status of the previous command (echo $?).
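The preview-run-verify habit described above can be sketched on a scratch directory like this. The file names are made up for the demonstration; the point is only the three steps.

```shell
#!/bin/sh
# Scratch directory with hypothetical files, so nothing real is touched.
workdir=$(mktemp -d)
touch "$workdir/a.log" "$workdir/b.log" "$workdir/keep.txt"

# 1. Preview: with echo in front, the shell expands the glob but rm does not run.
echo rm -- "$workdir"/*.log

# 2. Run the command for real and capture its exit status immediately.
rm -- "$workdir"/*.log
status=$?
echo "exit status: $status"

# 3. Verify that the result is what was planned: only keep.txt should remain.
ls "$workdir"
```

Note that the echo preview shows how arguments expand, but not what redirections or pipes would do, so those still deserve a test on a scratch file.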

    It's much better to use an editor, e.g. sudo vi /etc/fstab. Before you change even one character in this file, check it in with RCS (apt install rcs) like this:

    sudo ci -l /etc/fstab
    

    This backs up the file as fstab,v. If you do this before each change, you will always have every previous version of this file, and of any other system file you change. The commands are simple, but read the manuals for rcs, ci and co, and especially remember the -l option (without it, ci removes the working file after checking it in).
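To get a feel for the RCS round trip before using it on /etc/fstab, it can be practiced on a scratch copy. This is a sketch that assumes the rcs package is installed; the scratch directory, file contents, and log messages are illustrative.

```shell
#!/bin/sh
# Practice on a scratch copy first; the directory and file contents are illustrative.
work=$(mktemp -d)
cd "$work" || exit 1
printf '%s\n' '/dev/sdb1 /media/Media ext4 user 0 0' > fstab

# First check-in: creates fstab,v and, because of -l, keeps fstab in place and locked.
ci -q -l -t-'scratch copy of fstab' -m'initial version' fstab

# Edit the file, then review the pending change against the last check-in.
printf '%s\n' '# edited by hand' >> fstab
rcsdiff -q fstab

# Check in the change as a new revision and show the history.
ci -q -l -m'add a comment line' fstab
rlog fstab

# Print the first revision to stdout without touching the working file.
co -q -p -r1.1 fstab
```

The same commands apply to the real file with sudo in front, in which case fstab,v ends up next to /etc/fstab.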

    2. You cannot rely on mounting disks by /dev/sda, /dev/sdb, etc., because these names depend on the order in which the kernel registers the disks, and that order can change between boots, especially for USB devices, which may introduce additional delays. The solution is to mount the disks by UUID in /etc/fstab, e.g.:

    UUID=41c22818-fb56-4da6-8196-c816df0b7aa8 /media/yourname/backup ext3 defaults 0 1
    

    Read the fstab manual, note the meaning of the two trailing digits, and check that you have write access to the mount directory.
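One way to find entries that still need converting is to scan fstab for kernel device names. The sketch below uses a hypothetical helper, `list_kernel_name_mounts`, and runs it on a scratch file built from the entries in the question plus the UUID example above; on a real system it would be pointed at /etc/fstab.

```shell
#!/bin/sh
# List fstab entries that are mounted by a kernel name such as /dev/sdb1.
# Prints the device and mount point of each such entry; comment lines never match.
list_kernel_name_mounts() {
    awk '$1 ~ /^\/dev\/sd[a-z]/ { print $1, $2 }' "$1"
}

# Demo on a scratch file (contents taken from the question plus one UUID entry).
sample=$(mktemp)
cat > "$sample" <<'EOF'
# <device> <mount point> <type> <options> <dump> <pass>
/dev/sdb1 /media/Media ext4 user 0 0
/dev/sdc1 /media/MediaBackup ext4 user 0 0
UUID=41c22818-fb56-4da6-8196-c816df0b7aa8 /media/yourname/backup ext3 defaults 0 1
EOF
list_kernel_name_mounts "$sample"

# On the real machine, the UUID for an entry can be looked up with, for example:
#   lsblk -f
#   sudo blkid /dev/sdb1
```

Each device it reports is a candidate for replacing the first fstab field with UUID=... as in the example entry.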

    3. One more very important cautionary note. Mounting drives via USB is risky, and mounting a backup disk via USB is very risky. The hardware is unreliable, and you should ask not "if" but "when" it will fail. Whenever I am tempted to plug in a USB drive, I look at the disk I was left with after the disk controller's driver got corrupted: following a data synchronization it was filled with all zeros. The original data disappeared too, synchronized with the zeros. Fortunately I had a tape backup of this data.

    To prevent this type of failure, set up a minicomputer for a few $ with an internal disk, to which you copy data only over a wired network. It can even be a plain crossover cable if you don't have a router. Turn this server on only during backups and don't use it for anything else. Of course, all of this can be automated, including powering the server on remotely, as long as it supports "wake on lan".

    Using only one backup directory and synchronizing it is a bad idea. If the data is important, you need a backup plan with rotation of the backups. It is also worth checking that the backup is actually correct and can be restored. A backup that completed successfully is only half the battle.
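Rotation and verification can each be sketched as a small shell function. Both helpers below are hypothetical: `prune_snapshots` assumes each backup lives in its own directory named YYYY-MM-DD, and `verify_backup` compares SHA-256 checksums of every file, which is thorough but slow on a large media collection.

```shell
#!/bin/sh
# Keep only the newest $2 snapshots under $1. Assumes snapshot directories are
# named YYYY-MM-DD, so the shell's sorted glob expansion lists the oldest first.
prune_snapshots() {
    snap_root=$1
    keep=$2
    set -- "$snap_root"/????-??-??
    [ -e "$1" ] || return 0          # no snapshots yet
    while [ "$#" -gt "$keep" ]; do
        echo "removing old snapshot: $1"
        rm -rf -- "$1"
        shift
    done
}

# Compare checksums of every file under source $1 and backup $2.
# Returns 0 if identical, 1 if they differ, 2 if a directory cannot be entered.
verify_backup() {
    src_sums=$(cd "$1" && find . -type f -exec sha256sum {} + | sort -k 2) || return 2
    dst_sums=$(cd "$2" && find . -type f -exec sha256sum {} + | sort -k 2) || return 2
    [ "$src_sums" = "$dst_sums" ]
}
```

For example, `prune_snapshots /media/MediaBackup 7` would keep the seven most recent dated snapshots, and `verify_backup /media/Media /media/MediaBackup/2024-01-31` would check one of them against the source (the snapshot path is illustrative). A matching checksum still does not prove a restore works, so an occasional test restore remains worthwhile.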