Search code examples
cconcurrencylockingc99

A tested implementation of Peterson Lock algorithm?


I am seeking a good/correct implementation of Peterson's Lock algorithm in C? I can't seem to find this.


Solution

  • I won't make any assertions about how good or correct the implementation is, but it was tested (briefly). This is a straight translation of the algorithm described on wikipedia.

    struct petersonslock_t {
        volatile unsigned flag[2];
        volatile unsigned turn;
    };
    typedef struct petersonslock_t petersonslock_t;
    
    petersonslock_t petersonslock () {
        petersonslock_t l = { { 0U, 0U }, ~0U };
        return l;
    }
    
    void petersonslock_lock (petersonslock_t *l, int p) {
        assert(p == 0 || p == 1);
        l->flag[p] = 1;
        l->turn = !p;
        while (l->flag[!p] && (l->turn == !p)) {}
    };
    
    void petersonslock_unlock (petersonslock_t *l, int p) {
        assert(p == 0 || p == 1);
        l->flag[p] = 0;
    };
    

    Greg points out that on an SMP architecture with slightly relaxed memory coherency (such as x86), although the loads to the same memory location are in order, loads to different locations on one processor may appear out of order to the other processor.

    Jens Gustedt and ninjalj recommend modifying the original algorithm to use the atomic_flag type. This means setting the flags and turns would use the atomic_flag_test_and_set and clearing them would use atomic_flag_clear from C11. Alternatively, a memory barrier could be imposed between updates to flag.

    Edit: I originally attempted to correct for this by writing to the same memory location for all the states. ninjalj pointed out that the bitwise operations turned the state operations into RMW rather than load and stores of the original algorithm. So, atomic bitwise operations are required. C11 provides such operators, as does GCC with built-ins. The algorithm below uses GCC built-ins, but wrapped in macros so that it can easily be changed to some other implementation. However, modifying the original algorithm above is the preferred solution.

    struct petersonslock_t {
        volatile unsigned state;
    };
    typedef struct petersonslock_t petersonslock_t;
    
    #define ATOMIC_OR(x,v)   __sync_or_and_fetch(&x, v)
    #define ATOMIC_AND(x,v)  __sync_and_and_fetch(&x, v)
    
    petersonslock_t petersonslock () {
        petersonslock_t l = { 0x000000U };
        return l;
    }
    
    void petersonslock_lock (petersonslock_t *l, int p) {
        assert(p == 0 || p == 1);
        unsigned mask = (p == 0) ? 0xFF0000 : 0x00FF00;
        ATOMIC_OR(l->state, (p == 0) ? 0x000100 : 0x010000);
        (p == 0) ? ATOMIC_OR(l->state, 0x000001) : ATOMIC_AND(l->state, 0xFFFF00);
        while ((l->state & mask) && (l->state & 0x0000FF) == !p) {}
    };
    
    void petersonslock_unlock (petersonslock_t *l, int p) {
        assert(p == 0 || p == 1);
        ATOMIC_AND(l->state, (p == 0) ? 0xFF00FF : 0x00FFFF);
    };