Tags: python, linux, git, background-process

git post-receive hook not running in background


According to the git documentation, the post-receive hook essentially blocks the repo until it has completed:

... the client doesn’t disconnect until it has completed, so be careful if you try to do anything that may take a long time.

This causes a problem if you need the hook to kick off a build job and then poll for its completion before kicking off another job, such as a deploy. For example, the build server cannot fetch from the repo while the hook script is running.

Let's also assume that you have absolutely no ability to place your script on the git server to be executed as a shell command with the whole nohup /usr/bin/env python /path/to/post_receive.py > /dev/null 2>&1 & approach, similar to this question.

Let's also assume that you have tried the double os.fork() daemonizing approach, similar to this and a few other questions (non-working sample code below), and found that git still waits for the long-running child to finish before completing the hook.

pid = os.fork()
if pid == 0:
    os.setsid()
    pid = os.fork()
    if pid == 0:
        long_running_post_receive_function()
    else:
        os._exit(0)
else:
    for fd in range(0, 3):
        os.close(fd)
    os._exit(0)

So, with these constraints, has anyone been successful with a long-running Python post-receive hook that actually runs in the background without blocking the repo?

EDIT

Working minimal structure with no exception handling, thanks to @torek and @jthill:

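import os
import sys

pid = os.fork()
if pid == 0:
    os.setsid()  # first child: start a new session
    pid = os.fork()
    if pid == 0:
        # grandchild: release stdin, stdout, and stderr so the client can disconnect
        for fd in range(0, 3):
            os.close(fd)
        long_running_post_receive_function()
    else:
        os._exit(0)  # first child exits at once
else:
    sys.exit()  # original hook process returns immediately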

Solution

  • You need to close down all descriptor access, so that ssh knows it will never get any more data. In other words, call os.close on descriptors 0 through 2. In practice you need those to be open though, so it's better to open os.devnull and os.dup2 the resulting descriptor over 0, 1, and 2. For truly robust software, check whether os.open already returned a value 0 <= fd <= 2; if it did, that's OK, just keep it in place while dup2-ing the rest.

    (You still also need the usual double-fork trick, and it may be wise to ditch session IDs and so on. In some Unix-derived systems there is a library routine called daemon, which may be in libc or libutil, that does all this for you. Some details are inevitably OS-dependent, such as the way to give up the controlling terminal if any. However, the main thing missing from your linked Python-specific answer is the replacement of the stdin/stdout/stderr descriptors.)
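The advice above can be sketched in Python roughly as follows. This is a minimal illustration, not a hardened daemon: the helper names redirect_std_fds_to_devnull and spawn_detached are hypothetical (they are not part of git or the standard library), and the sketch assumes a Unix-like system where os.fork is available. Instead of closing descriptors 0 through 2, it points them at os.devnull, and it leaves a descriptor in place if os.open happens to return one in that range.

```python
import os
import sys


def redirect_std_fds_to_devnull():
    """Point descriptors 0, 1, and 2 at os.devnull instead of closing them."""
    devnull = os.open(os.devnull, os.O_RDWR)
    for fd in (0, 1, 2):
        # If os.open itself returned 0, 1, or 2, leave that one in place.
        if fd != devnull:
            os.dup2(devnull, fd)
    # Close the extra descriptor only if it is not one of the standard three.
    if devnull > 2:
        os.close(devnull)


def spawn_detached(job):
    """Run job() in a detached grandchild and return promptly in the caller."""
    # Flush buffered output so it is not written twice after the fork.
    sys.stdout.flush()
    sys.stderr.flush()

    pid = os.fork()
    if pid > 0:
        # Original process: reap the short-lived first child, then return.
        os.waitpid(pid, 0)
        return

    # First child: start a new session, fork again, and exit immediately.
    os.setsid()
    if os.fork() > 0:
        os._exit(0)

    # Grandchild: drop the inherited descriptors, then do the slow work.
    status = 0
    try:
        redirect_std_fds_to_devnull()
        job()
    except BaseException:
        status = 1
    finally:
        # Skip interpreter cleanup that belongs to the original process.
        os._exit(status)
```

In a hook, the call would look something like spawn_detached(long_running_post_receive_function), after which the hook script simply returns. Because post-receive receives the updated refs on standard input, it is probably safest to read sys.stdin before detaching, since the grandchild's stdin is /dev/null afterwards.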