Search code examples
cdata-structuresthread-safetymutexrace-condition

Thread-safe implementation of is_empty in a queue datastructure


I am trying to create a thread-safe queue, and while the bulk of the operations

 queue *queue_create_empty(void);
 void queue_enqueue(queue *q, const void *value, const size_t value_size);
 void *queue_dequeue(queue *q);
 void queue_destroy(queue *q, void (*freefunc)(void *value));

I am finding one to be particularily elusive,

 bool queue_is_empty(const queue *q);

In particular if the function definition is

 bool queue_is_empty(const queue *q) {
     assert(q);

     pthread_mutex_lock(q->mutex);
     bool ret = q->is_empty;
     pthread_mutex_unlock(q->mutex);

     /* 
      * Here there is a possibility of a race condition, the queue may actually
      * be empty at this point, and therefore the return value is misleading
      */
     return ret;
 }

then there is a possibility of a race-condition. The last comment

Here there is a possibility of a race condition, the queue may actually be empty at this point, and therefore the return value is misleading

describes the issue.

I have defined my datastructure as such,

 typedef struct queue {
     element *head;
     element *tail;

     /* Lock for all read/write operations */
     pthread_mutex_t *mutex;

     /* Condition variable for whether or not the queue is empty */
     /* Negation in variable names is poor practice but it is the */
     /* only solution here considering the conditional variable API */
     /* (signal/broadcast) */
     pthread_cond_t *not_empty;

     /* Flag used to gauge when to wait on the not_empty condition variable */
     bool is_empty;

     /* A flag to set whenever the queue is about to be destroyed */
     bool cancel;
 } queue;

And for the implementation of the other functions I have worked around the inadequate queue_is_empty function by only checking the value of q->is_empty since I have already taken a lock on the structure before checking the value.

The queue_dequeue function serves as an example of where one would want to know if the queue is empty. See the first if-statement.

 void *queue_dequeue(queue *q) {
     assert(q);

     void *value = NULL;
     pthread_mutex_lock(q->mutex);

     /* We have a mutex-lock here, so we can atomically check this flag */
     if (q->is_empty) {
         /* We do not have to check the cancel flag here, if the thread is awoken */
         /* in a destruction context the waiting thread will be awoken, the later */
         /* if statement checks the flag before modification of the queue */

         /* This allows other threads to access the lock, thus signalling this thread */
         /* When this thread is awoken by this wait the queue will not be empty, or */
         /* the queue is about to be destroyed */
         pthread_cond_wait(q->not_empty, q->mutex);
     }

     /* We have a mutex lock again so we may atomically check both flags */
     if (!q->cancel && !q->is_empty) {
         value = q->head->value;
         if (q->head->next) {            
             q->head = q->head->next;
             free(q->head->previous);

             /* Make dereferencing dangling pointers crash the program */
             q->head->previous = NULL;
         } else {
             free(q->head);
             q->head = NULL;
             q->is_empty = true;
         }        
     }

     pthread_mutex_unlock(q->mutex);
     return value; 
 }

How do I expose a thread-safe queue_is_empty function?


Solution

  • You've just discovered the reason why locking, threading and concurrency in general is hard. It's easy to create a few wrapper functions that obtain and release locks and prevent crashes from data corruption, it's actually hard to get locking right when you depend on state from earlier accesses.

    I'd argue that the function queue_is_empty is incorrect because it exists at all. It cannot possibly return a useful value because, as you discovered, the return value is stale before you return it. Since the function cannot return a useful return value it should not exist. And this is where you need to think hard about the API you provide.

    One option is to make the locking the responsibility of the callers. The caller could have something like:

    queue_lock(q);
    if (!queue_is_empty(q))
        do_something(q);
    queue_unlock(q);
    

    You will then run into issues with error handling and returns from functions, lock state changes around functions not part of the queue interface, etc. And as soon as you have more than one of those locks you need to manage lock ordering to prevent deadlocks.

    Another option is to reduce the API to only provide safe operations. Enqueue and dequeue work correctly. Do you actually need is_empty? What is it good for? Is it just to avoid waiting for a queue element when the queue is empty? Can you not solve it with dequeue_without_waiting instead? Etc.