git, version-control

Does git pull write files atomically?


I can't find anything in the documentation. If I do a git pull, am I guaranteed that each file resulting from the merge is written atomically?

Some more context about what I am trying to achieve: I have some scripts that periodically do a git pull and I need to know if I can rely on the state of the files being valid during a pull.

We are basically using git as a deployment tool. We never have merge conflicts by design. On the remote end, one job pulls every x seconds while other jobs read the files. The risk is that a job opens a file while git is still writing it and sees contents it does not expect, unless git is smart enough to use some atomic swap provided by the underlying OS (Red Hat in this case).


Solution

  • The short answer is no.

    It's worth considering that git pull isn't about files at all, it's about commits. Files are just a side effect. :-) The pull operation is just git fetch (obtain commits) followed by a second Git command, usually git merge. The merge step merges commits. If the operation is a true merge rather than a fast-forward, this has the side effect of merging files as well. Either way, once the merge or fast-forward is complete, Git does a git checkout of the resulting commit.

    So this really boils down to: Is git checkout atomic at the OS level? The answer is a very loud no: it's not atomic in any way. Individual files in the work-tree are written one at a time, using OS-level write calls, which are not atomic. Files that need to be created or deleted are also handled one at a time. Git does use the index, which indexes (i.e., keeps tabs on) the work-tree, to minimize the number of files removed, created, or rewritten in place. Git also locks against other Git operations, which makes the Git-level transaction appear atomic, but anything working outside Git that does not cooperate with Git's locking system will be able to see the changes as they occur.
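
For contrast, here is a minimal Python sketch of the kind of "atomic swap" the question has in mind. It is not something Git does or provides: atomic_write_text is a hypothetical helper, and the sketch assumes a POSIX filesystem (as on Red Hat), where renaming a file over another within the same filesystem is atomic.

```python
import os
import tempfile


def atomic_write_text(path: str, text: str) -> None:
    """Replace `path` so that readers see the old or the new content, never a mix.

    Hypothetical helper for illustration only; Git does not write
    work-tree files this way.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file must be on the same filesystem as the target,
    # otherwise the final rename cannot be atomic.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".tmp-")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(text)
            tmp.flush()
            os.fsync(tmp.fileno())  # push the data to disk before publishing it
        # Atomic on POSIX: a concurrent open() gets the old or the new file.
        os.replace(tmp_path, path)
    except BaseException:
        # Remove the temporary file if anything failed before the rename.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

This only protects a single file. A set of files that must change together would need the same trick one level up, for example by checking the new commit out into a fresh directory and then renaming a symlink over the one the readers follow.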