Search code examples
race-conditionlock-freereference-counting

Overcoming the race condition in lock-free reference-counted dereferences


Imagine a structure like this:

struct my_struct {
    uint32_t refs
    ...
}

for which a pointer is acquired through a lookup table:

struct my_struct** table;

my_struct* my_struct_lookup(const char* name)
{
    my_struct* s = table[hash(name)];

    /* EDIT: Race condition here. */

    atomic_inc(&s->refs);

    return s;
}

A race exists between the dereference and the atomic increment in a multi-threaded model. Given that this is very performance critical code, I was wondering how this race inbetween the dereference and atomic increment is typically resolved or worked around?

EDIT: When acquiring a pointer to a my_struct structure via the lookup table, it is necessary to first dereference the structure in order to increment its reference count. This creates a problem in multi-threaded code when other threads could be altering the reference count and potentially deallocating the object itself while another thread would then dereference a pointer to non-existent memory. Combined with preemption and some bad luck, this could be a recipe for disaster.


Solution

  • As someone said above, you can make linked list of memory to free at some later time, so your pointers are never invalid. This is a handy method in some cases.

    Or....you can make a 64 bit struct with your 32 bit pointer and have 32 bits for a ref count and other flags. You can use 64 bit atomic ops on the struct if you wrap it in a union:

    union my_struct_ref {
        struct { 
          unsigned int       cUse     : 16,
                             fDeleted : 1;    // etc 
          struct my_struct  *s;
        } Data;
        unsigned long n64;
    } 
    

    You can human readably work with the Data part of the struct, and you can use CAS on the n64 bit part.

    my_struct* my_struct_lookup(const char* name)
    {
        struct my_struct_ref Old, New;
        int iHash = hash(name);
    
        // concurrency loop
        while (1) {
          Old.n64 = table[iHash].n64;
          if (Old.Data.fDeleted)
            return NULL;
          New.n64 = Old.n64;
          New.Data.cRef++;
          if (CAS(&table[iHash].n64, Old.n64, New.n64)) // CAS = atomic compare and swap
            return New.Data.s; // success
          // we get here if some other thread changed the count or deleted our pointer
          // in between when we got a copy of it int old.  Just loop to try again.
        } 
    }
    

    If you are using 64 bit pointers you will need to do 128 bit CAS.