Linux daemonize without PID file race condition

I have worked several times on making a program run as a daemon under Linux.

In one case, I've just used daemon().
On another occasion, I've written my own daemon code (based on something like this) because I wanted to do more complicated redirection of STDIN, STDOUT, etc.
I've also used the Busybox start-stop-daemon to start a C# Mono program as a daemon, generating a PID file with the -m option.

The problem is that all of these solutions have a race condition on PID file creation: the PID file is written by the program's background process at some indeterminate time after the foreground process has exited. This is a problem in, for example, an embedded Linux system where the program is started by an initscript and a watchdog process is started last, which checks that the program is running by reading its PID file. In the C# Mono case using start-stop-daemon, I've had such a system occasionally rebooted by the watchdog at start-up, because the program's PID file hadn't yet been written by the time the watchdog began monitoring (surprising as it may be that this would ever happen in practice).

How can a program be daemonized without a PID file race condition? That is, in a way that guarantees the PID file is fully created and contains the valid PID by the time the foreground process exits.

Note that this is made a little more difficult by the Linux fork-setsid-fork daemon idiom (used to prevent the daemon from acquiring a controlling tty), because the parent can't easily get the grandchild's PID.

Solution

I'm trying the following code. The essential points are:

The parent of the first fork waits until the child exits.
The child of the first fork does various daemon set up, then does a second fork. The parent of the second fork (which gets the PID of its child) writes the PID into the PID file, then exits.

So with this method, the foreground process doesn't exit until the background process' PID has been written.

(Note the difference between exit() and _exit(). exit() does a normal shutdown, which can include unlocking and deleting the PID file, either in a C++ destructor or in a C atexit() function. _exit() skips all of that. This matters because the foreground process and the intermediate child must not clean up the PID file on their way out; only the background process should do that when it eventually terminates. A flock() lock belongs to the open file description, which is shared across fork(), so the background process keeps the PID file open and locked after the other two processes have gone, which allows for a "singleton" daemon. So the program should open the PID file and flock() it before calling this function. A C program should register an atexit() function that closes and deletes the PID file; a C++ program should use an RAII-style class that creates the PID file and closes/deletes it on exit.)
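The PID file set-up described in that note could look roughly like the following for a C program. This is only a sketch under assumptions: the names pidfile_open_locked(), pidfile_cleanup(), g_pid_fd and g_pid_path are hypothetical helpers invented for illustration, and the 0644 mode is an arbitrary choice.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper state, invented for this sketch. */
static int  g_pid_fd = -1;
static char g_pid_path[256];

/* atexit() handler: runs on exit() or return from main(), never on _exit().
 * Only the daemon itself should ever reach this. */
static void pidfile_cleanup(void)
{
    if (g_pid_fd >= 0) {
        unlink(g_pid_path);     /* remove the name while the lock is still held */
        close(g_pid_fd);        /* lock goes away once no duplicate fd remains */
        g_pid_fd = -1;
    }
}

/* Open (or create) the PID file and take an exclusive, non-blocking lock.
 * Returns the descriptor, or -1 with errno set; EWOULDBLOCK means another
 * instance already holds the lock. */
static int pidfile_open_locked(const char *path)
{
    static int handler_registered = 0;
    int fd, saved_errno;

    if (strlen(path) >= sizeof(g_pid_path)) {
        errno = ENAMETOOLONG;
        return -1;
    }
    fd = open(path, O_RDWR | O_CREAT, 0644);
    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) < 0) {
        saved_errno = errno;    /* keep flock()'s errno across close() */
        close(fd);
        errno = saved_errno;
        return -1;
    }
    if (!handler_registered) {
        if (atexit(pidfile_cleanup) != 0) {
            close(fd);
            errno = ENOMEM;
            return -1;
        }
        handler_registered = 1;
    }
    memcpy(g_pid_path, path, strlen(path) + 1);
    g_pid_fd = fd;
    return fd;
}
```

A caller would then do something like pid_fd = pidfile_open_locked("/var/run/mydaemon.pid") (a made-up path), give up if the result is -1, and pass pid_fd to the function below. Because the foreground and intermediate processes leave via _exit(), the handler only ever runs in the daemon.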

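#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* pid_fd: an already-open (and ideally flock()ed) PID file descriptor,
 * or -1 to skip writing a PID file.
 * Returns 0 in the daemon (grandchild); the other processes never return. */
int daemon_with_pid(int pid_fd)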
{
    int         fd;
    pid_t       pid;
    pid_t       pid_wait;
    int         stat;
    int         file_bytes;
    char        pidfile_buffer[32];

    pid = fork();
    if (pid < 0) {
        perror("daemon fork");
        exit(20);
    }
    if (pid > 0) {
        /* We are the parent.
         * Wait for child to exit. The child will do a second fork,
         * write the PID of the grandchild to the pidfile, then exit.
         * We wait for this to avoid race condition on pidfile writing.
         * I.e. when we exit, pidfile contents are guaranteed valid. */
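        for (;;) {
            pid_wait = waitpid(pid, &stat, 0);
            if (pid_wait == -1) {
                if (errno == EINTR)
                    continue;
                /* stat is not valid if waitpid() failed. */
                perror("daemon waitpid");
                _exit(21);
            }
            /* Without WUNTRACED/WCONTINUED, waitpid() only returns
             * once the child has terminated. */
            break;
        }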
        if (WIFEXITED(stat)) {
            if (WEXITSTATUS(stat) != 0) {
                fprintf(stderr, "Error in child process\n");
                exit(WEXITSTATUS(stat));
            }
            _exit(0);
        }
        _exit(21);
    }

    /* We are the child. Set up for daemon and then do second fork. */
    /* Set current directory to / */
    if (chdir("/") < 0)
        _exit(29);

    /* Redirect STDIN, STDOUT, STDERR to /dev/null */
    fd = open("/dev/null", O_RDWR);
    if (fd < 0)
        _exit(22);
    stat = dup2(fd, STDIN_FILENO);
    if (stat < 0)
        _exit(23);
    stat = dup2(fd, STDOUT_FILENO);
    if (stat < 0)
        _exit(23);
    stat = dup2(fd, STDERR_FILENO);
    if (stat < 0)
        _exit(23);
    /* The original /dev/null descriptor is no longer needed. */
    if (fd > STDERR_FILENO)
        close(fd);

    /* Start a new session for the daemon. */
    if (setsid() < 0)
        _exit(30);

    /* Do a second fork */
    pid = fork();
    if (pid < 0) {
        _exit(24);
    }
    if (pid > 0) {
        /* We are the parent in this second fork; child of the first fork.
         * Write the PID to the pidfile, then exit. */
        if (pid_fd >= 0) {
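            /* pid_t has no printf conversion of its own; cast to long. */
            file_bytes = snprintf(pidfile_buffer, sizeof(pidfile_buffer), "%ld\n", (long)pid);
            if (file_bytes <= 0 || file_bytes >= (int)sizeof(pidfile_buffer))
                _exit(25);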
            stat = ftruncate(pid_fd, 0);
            if (stat < 0)
                _exit(26);
            stat = lseek(pid_fd, 0, SEEK_SET);
            if (stat < 0)
                _exit(27);
            stat = write(pid_fd, pidfile_buffer, file_bytes);
            if (stat < file_bytes)
                _exit(28);
        }
        _exit(0);

    }
    /* We are the child of the second fork; grandchild of the first fork. */
    return 0;
}