How to come out of a deadlock in Linux


On a multithreaded system, suppose two threads each lock a mutex before working on shared memory.

Thread A:

pthread_mutex_lock(&mutex);
/* ... memory corruption or assert: the thread exits here ... */
pthread_mutex_unlock(&mutex);   /* never reached */

Thread B:

pthread_mutex_lock(&mutex);     /* blocks forever if Thread A died holding the lock */
/* ... */
pthread_mutex_unlock(&mutex);

If Thread A acquires the mutex first and then exits because of memory corruption or a failed assert, Thread B waits forever, causing a deadlock.

  1. Is there a way I can come out of this kind of deadlock once it has happened?
  2. Is there any safer mechanism, similar to a mutex, that I can use?

Solution

  • You can set the ROBUST attribute on a mutex. With a robust mutex, if the thread that acquired it exits for some reason without unlocking it, the mutex enters a special state where the next thread that attempts to lock it will get EOWNERDEAD.

    It is then the responsibility of that thread to clean up any inconsistent state. If recovery is possible, the thread shall call pthread_mutex_consistent(3) any time before pthread_mutex_unlock(3), so that the other threads can use the mutex as before. If recovery is not possible, the mutex should be unlocked without calling pthread_mutex_consistent(3), causing it to enter an unusable state where the only permissible operation is to destroy it (further lock attempts fail with ENOTRECOVERABLE).

    Note that the mutex is locked even if EOWNERDEAD was returned (I think it's the only condition under which pthread_mutex_lock(3) returns with an error but locks the mutex).

    To set the ROBUST attribute, use pthread_mutexattr_setrobust(3) after initializing the mutex attributes instance. Remember that this must be done before initializing the mutex. So, something like:

    pthread_mutex_t mutex;
    pthread_mutexattr_t mutex_attrs;
    
    if (pthread_mutexattr_init(&mutex_attrs) != 0) {
        /* Handle error... */
    }
    if (pthread_mutexattr_setrobust(&mutex_attrs, PTHREAD_MUTEX_ROBUST) != 0) {
        /* Handle error... */
    }
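    if (pthread_mutex_init(&mutex, &mutex_attrs) != 0) {
        /* Handle error... */
    }
    /* The attributes object is no longer needed once the mutex is initialized */
    pthread_mutexattr_destroy(&mutex_attrs);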
    

    Then you can use it like:

    int lock_res = pthread_mutex_lock(&mutex);
    
    if (lock_res == EOWNERDEAD) {
        /* Someone died before unlocking the mutex
         * We assume there's no cleanup to do
         */
        if (pthread_mutex_consistent(&mutex) != 0) {
            /* Handle error... */
        }
    } else if (lock_res != 0) {
        /* Some other error, handle it here */
    }
    
    /* mutex is locked here, do stuff... */
    
    if (pthread_mutex_unlock(&mutex) != 0) {
        /* Handle error */
    }
    

    For more info, see the manpages for pthread_mutex_consistent(3) and pthread_mutexattr_getrobust(3) / pthread_mutexattr_setrobust(3).