Is it safe to overwrite a .so file or an executable in use using rsync?

This is not really a programming question. We have a large system written in C++ that uses many shared objects (.so) and native executables on Red Hat Enterprise Linux. The system runs on multiple hosts, and we use rsync to keep the deployed binaries (shared objects and executables) in sync across them.

If we have to fix a bug in a .so (or executable), we deploy it to a single location and then rsync it to all the other hosts.

Is it safe to overwrite a .so (or executable) while it is in use (or running)? I have read that rm and cp are safe because of how *nix handles inodes (some sort of reference counting), but I couldn't find a satisfactory answer when it comes to rsync.

Solution

Short answer

It's perfectly safe within a single file if you don't use --inplace.

It's mostly safe for multiple interdependent files, but has some risks which using --delay-updates will minimize.

Long answer

By default (that is, when not using --inplace), rsync writes the new contents to a temporary file in the same directory (named something like .your_file.XXXXXX, with a random suffix), and then renames it over the original file when the transfer of that file is complete.

This rename is a completely atomic operation: Anything trying to open the file will either get the original file, or the replacement (after that replacement is entirely complete).

Moreover, if the original is in use, then its reference count will be nonzero even after the directory entry pointing to it is overwritten with the new entry pointing to the different inode, so the content will remain on-disk (undeleted) until the original file is no longer open.
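The write-to-temporary-then-rename behavior described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not rsync's actual code: it assumes a POSIX filesystem, and the names replace_atomically, demo, and libfoo.so are made up for the example. An ordinary open file handle stands in for a process that has the old library loaded.

```python
import os
import tempfile


def replace_atomically(path, data):
    """Write data to a temporary file next to path, then rename it over path."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(prefix=".", dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()
            os.fsync(tmp.fileno())
        # rename(2) under the hood: readers see either the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial temporary file
        raise


def demo():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "libfoo.so")  # hypothetical file name
        with open(target, "wb") as f:
            f.write(b"old build")
        old_inode = os.stat(target).st_ino

        reader = open(target, "rb")  # stands in for a process using the file
        replace_atomically(target, b"new build")
        new_inode = os.stat(target).st_ino

        still_open = reader.read()  # read through the handle opened earlier
        reader.close()
        with open(target, "rb") as f:
            fresh = f.read()  # a new open() follows the new directory entry
        return old_inode != new_inode, still_open, fresh
```

Calling demo() returns whether the inode changed, what the pre-existing handle read, and what a fresh open read. The old handle keeps reading the old inode's content, while anything opening the path afterwards gets the replacement.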

However, with multiple files, you run a risk that only some of those files will be atomically replaced. If you're copying over both a new foo and a libfoo.so such that the old foo won't work with the new libfoo.so and the new foo won't work with the old libfoo.so, you're in a bad situation if you're trying to start an executable after the new libfoo.so has been rename()'d into place but foo hasn't yet.

The nearest thing to a fix for this that rsync has available is the --delay-updates option, which holds each updated file in a temporary holding directory (.~tmp~ inside the file's destination directory) until the whole transfer has finished, and only then renames the files into place in rapid succession. There's still no operating-system-level guarantee that you can't see an updated version of one file and not the other, but the time window in which this can occur is made substantially smaller.
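The idea of finishing all the slow copying first and doing the renames together at the end can be sketched as follows. This is a rough approximation of the approach, not rsync's implementation; staged_update and the .staging directory name are hypothetical, and the sketch assumes all files live in one directory on a POSIX filesystem.

```python
import os
import tempfile


def staged_update(directory, new_contents):
    """Stage every new file first, then rename them all into place together.

    new_contents maps a file name to the bytes it should contain.
    """
    staging = os.path.join(directory, ".staging")  # hypothetical holding directory
    os.makedirs(staging, exist_ok=True)
    staged = []

    # Phase 1 (slow): write the new versions. Nothing visible to readers changes.
    for name, data in new_contents.items():
        tmp_path = os.path.join(staging, name)
        with open(tmp_path, "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())
        staged.append((tmp_path, os.path.join(directory, name)))

    # Phase 2 (fast): each rename is atomic on its own, but the group is not.
    # A process starting between two renames can still see a mixed set.
    for tmp_path, final_path in staged:
        os.replace(tmp_path, final_path)

    os.rmdir(staging)
```

The window in which foo and libfoo.so disagree shrinks to the time between two rename calls, but it does not disappear. If a mixed set of files must never be observed, a different deployment scheme is needed, such as installing into a new versioned directory and switching a symlink.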

If using --inplace, rsync writes directly into the existing file instead of replacing it. For a file that is currently being executed, Linux refuses the write with a "Text file busy" (ETXTBSY) error. This protection is not enforced for all access on UNIX, and on Linux it generally does not cover shared libraries mapped by the dynamic loader. Where it is not enforced, any scenario where mmap() is used to provide memory regions reflecting file contents (which is typically how shared libraries are loaded) would cause Bad Things to happen in the event of an in-place overwrite, because the running process can see its code change underneath it.
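Why an in-place write is dangerous for anything that already has the file open can be shown with a small Python sketch. It assumes a POSIX system and uses an ordinary data file rather than a running executable or an mmap()'d library, so it does not trigger "Text file busy"; overwrite_in_place, demo_in_place, and libfoo.so are names invented for the example.

```python
import os
import tempfile


def overwrite_in_place(path, data):
    """Rewrite the existing file's contents without creating a new inode."""
    # "r+b" opens the existing inode for writing instead of creating a new file.
    with open(path, "r+b") as f:
        f.write(data)
        f.truncate()  # drop any leftover bytes if the new data is shorter


def demo_in_place():
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "libfoo.so")  # hypothetical file name
        with open(target, "wb") as f:
            f.write(b"old build")
        old_inode = os.stat(target).st_ino

        reader = open(target, "rb")  # stands in for a process using the file
        overwrite_in_place(target, b"new build")

        same_inode = os.stat(target).st_ino == old_inode
        seen = reader.read()  # the already-open handle now sees the new bytes
        reader.close()
        return same_inode, seen
```

Here the inode stays the same, so the handle that was opened before the overwrite reads the new bytes. A process that has part of a library loaded from the old contents and later pages in the rest from the new contents ends up with an inconsistent mix, which is the failure the rename-based default avoids.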