c linux pthreads fork posix

Do pthread mutexes in a single-threaded process need to be re-initialized on fork()?


Preamble

The marked "dupe" above does not answer my question, as it concerns the safety of forking while threads are in use. I am not creating threads in my code; I am more concerned with the validity of pthread_mutex_t structs, and of any internals registered with the OS, when a fork() occurs. That is: are the mutexes re-created in the child process on fork(), or do children just have a valid (or invalid) shallow copy of the parent's mutex internals?


Background

I have an audio/hardware handling library that wraps some DSP functions with a simple API using a recursive pthread_mutex_t. The mutex is recursive because some API functions call other API functions in turn, and I want to make sure only a single thread ever enters the critical section per instance of the library. The code looks like this:

static pthread_mutex_t mutex = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;

void read() {
    pthread_mutex_lock(&mutex);
    // ...
    pthread_mutex_unlock(&mutex);
}

void write() {
    pthread_mutex_lock(&mutex);
    // ...
    pthread_mutex_unlock(&mutex);
}

void toggle() {
    pthread_mutex_lock(&mutex);
    read();
    // ...
    write();
    pthread_mutex_unlock(&mutex);
}

Question

If a user application uses my library and calls fork(), does the child process's instance of my library need to re-initialize its mutex? I know child processes don't inherit threads, and that a specific mutex initialization flag is needed if I want the two processes to truly share the mutex (I can't recall what the flag is), or I have to make use of mmap, IIRC. But is the mutex instance used by the child valid? That is, does fork() duplicate the internal values even though they are no longer valid, or is a new mutex initialized with the OS? I don't want the child and parent process to share a mutex after a fork(), but I want to make sure the client is using a valid mutex handle.

Note: I can guarantee that the mutex will not be locked when the fork() call is issued.

Thank you.


Solution

  • With respect to POSIX, I agree with caf's answer; fortunately, Linux defines the semantics.

    From the Linux fork(2) man page:

    The child process is created with a single thread—the one that called fork(). The entire virtual address space of the parent is replicated in the child, including the states of mutexes, condition variables, and other pthreads objects; the use of pthread_atfork(3) may be helpful for dealing with problems that this can cause.

    So, in Linux, if the mutexes are unlocked, they do not need to be reinitialized after a fork().
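
    A minimal sketch of how one might check this on a Linux/glibc system. The names demo_lock, demo_lock_init() and demo_child_can_use_inherited_mutex() are hypothetical, and the mutex is created with pthread_mutexattr_settype() rather than the _NP static initializer, so that no feature-test macro beyond POSIX is needed. The parent uses the mutex, leaves it unlocked, forks, and the child then locks and unlocks its inherited copy without reinitializing it:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t demo_lock;   /* hypothetical stand-in for the library mutex */

/* Create demo_lock as an unlocked recursive mutex. Returns 0 on success. */
static int demo_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&demo_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Returns 0 if the child could lock (recursively) and unlock the copy of
   the mutex it inherited across fork(); nonzero otherwise. */
int demo_child_can_use_inherited_mutex(void)
{
    if (demo_lock_init() != 0)
        return -1;

    /* Use the mutex in the parent, but leave it unlocked before forking. */
    if (pthread_mutex_lock(&demo_lock) != 0 ||
        pthread_mutex_unlock(&demo_lock) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: no reinitialization, just use the inherited copy. */
        int failed = 0;
        failed |= pthread_mutex_lock(&demo_lock) != 0;
        failed |= pthread_mutex_lock(&demo_lock) != 0;   /* recursive lock */
        failed |= pthread_mutex_unlock(&demo_lock) != 0;
        failed |= pthread_mutex_unlock(&demo_lock) != 0;
        _exit(failed ? 1 : 0);
    }

    /* Parent: collect the child's verdict from its exit status. */
    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

    Calling demo_child_can_use_inherited_mutex() from a small main() and checking for a zero return is enough to try it. This only covers the case from the question, where the mutex is known to be unlocked at the time of the fork().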


    More importantly, POSIX makes a similar guarantee for semaphores:

    Any semaphores that are open in the parent process shall also be open in the child process.

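    which means you could replace your recursive mutexes with semaphores. Note that a semaphore is not recursive: a function that already holds it must not wait on it again, or it deadlocks. The locked API functions below therefore delegate to unlocked internal helpers (do_read() and do_write() are illustrative names) instead of calling each other:

    #include <semaphore.h>
    
    static sem_t  my_lock;
    
    /* Initialize the semaphore to 1 (unlocked) when the library is loaded. */
    static void my_lock_init(void) __attribute__((constructor));
    static void my_lock_init(void) {
        sem_init(&my_lock, 0, 1U);
    }
    
    /* Unlocked helpers: call these only while holding my_lock. */
    static void do_read(void) {
        // ...
    }
    
    static void do_write(void) {
        // ...
    }
    
    void my_read(void) {
        sem_wait(&my_lock);
        do_read();
        sem_post(&my_lock);
    }
    
    void my_write(void) {
        sem_wait(&my_lock);
        do_write();
        sem_post(&my_lock);
    }
    
    void toggle(void) {
        sem_wait(&my_lock);
    
        // ...
        do_read();      /* not my_read(): the semaphore is already held */
        // ...
        do_write();     /* not my_write(), for the same reason */
        // ...
    
        sem_post(&my_lock);
    }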
    

    Reinitializing an already initialized semaphore leads to undefined behaviour, so the semaphore cannot simply be reinitialized in the child. The above therefore works only if you can prove that fork() never occurs while the semaphore is held, i.e. between a sem_wait(&my_lock) and its matching sem_post(&my_lock).

    The problem is that in a multithreaded program it is usually impossible to prove that no other thread is executing any of the above functions when one thread calls fork().


    In Linux, with kernel 2.6 or later and GNU C library version 2.5 or later, pthreads is based on NPTL, and the pthread locking primitives are implemented on top of the futex() syscall.

    The kernel is only aware of a futex while a thread is blocked or waiting on it. The rest of the time, the futex is just a normal data structure. (The kernel uses the address of a futex to tell futexes apart; addresses in shared memory are treated specially.)

    This means that when futexes are used, you can re-init your mutex safely after a fork(), as long as your own code does not do the forking while holding the mutex.

    As an example -- and remember, only for Linux 2.6 or later, and GNU C library 2.5 or later:

    #define  _GNU_SOURCE
    #include <pthread.h>
    
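    static pthread_mutex_t  my_lock = PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
    
    /* Child-side fork handler: overwrite the inherited copy with a fresh,
       unlocked recursive mutex. */
    static void my_reinit(void)
    {
        my_lock = (pthread_mutex_t)PTHREAD_RECURSIVE_MUTEX_INITIALIZER_NP;
    }
    
    /* Runs when the library is loaded; registers my_reinit to run in the
       child after every fork(). */
    static void my_init(void) __attribute__((constructor (65535)));
    static void my_init(void)
    {
        pthread_atfork(NULL, NULL, my_reinit);
    }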
    
    void my_read()
    {
        pthread_mutex_lock(&my_lock);
    
        // ...
    
        pthread_mutex_unlock(&my_lock);
    }
    
    void my_write()
    {
        pthread_mutex_lock(&my_lock);
    
        // ...
    
        pthread_mutex_unlock(&my_lock);
    }
    
    void toggle() 
    {
        pthread_mutex_lock(&my_lock);
    
        // ...
        my_read();
        // ...
        my_write();
        // ...
    
        pthread_mutex_unlock(&my_lock);
    }
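
    If other threads might be inside the library when fork() is called, a common variation is to register all three pthread_atfork() handlers. This is a sketch under assumptions, not a drop-in implementation: the guard_* names are hypothetical, the mutex is created with pthread_mutexattr_settype() instead of the _NP initializer, and reinitializing the (locked) copy in the child relies on the Linux/glibc behaviour described above rather than on anything POSIX guarantees. The prepare handler takes the lock so that no other thread is inside a critical section at the moment of the fork, the parent handler releases it again, and the child handler replaces the inherited copy with a fresh mutex:

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static pthread_mutex_t guard_lock;   /* hypothetical library lock */

/* (Re)create guard_lock as a fresh, unlocked recursive mutex. */
static int guard_lock_init(void)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(&guard_lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Runs in the forking thread just before fork(): wait until the lock is free. */
static void guard_prepare(void) { pthread_mutex_lock(&guard_lock); }

/* Runs in the parent after fork(): undo what guard_prepare() did. */
static void guard_parent(void)  { pthread_mutex_unlock(&guard_lock); }

/* Runs in the child after fork(): assumption (Linux/glibc), overwriting the
   inherited, locked copy with a fresh mutex is safe here. */
static void guard_child(void)   { (void)guard_lock_init(); }

/* Call once, e.g. from a constructor. Returns 0 on success. */
int guard_setup(void)
{
    int rc = guard_lock_init();
    if (rc != 0)
        return rc;
    return pthread_atfork(guard_prepare, guard_parent, guard_child);
}

/* Forks once and returns 0 if both the child and the parent can still
   take and release the lock afterwards. */
int guard_fork_check(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        int failed = pthread_mutex_lock(&guard_lock) != 0;
        failed |= pthread_mutex_unlock(&guard_lock) != 0;
        _exit(failed ? 1 : 0);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    if (WEXITSTATUS(status) != 0)
        return 1;

    /* The parent handler should have left the lock free again. */
    if (pthread_mutex_trylock(&guard_lock) != 0)
        return 2;
    pthread_mutex_unlock(&guard_lock);
    return 0;
}
```

    Because the mutex is recursive, the prepare handler also works when the forking thread already holds the lock; the parent handler then just drops the extra recursion level.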