Search code examples
c++c++11atomicmemory-modelcompare-and-swap

Using these atomic operations with memory order relaxed and rel/acq as shown, does this C++ snippet work?


Written in cppmem pseudo code:

int main()                                                                                                                                                                                   
{                                                                                                                                                                                            
  atomic_int n = -1;                                                                                                                                                                         
  atomic_int n2 = 0;                                                                                                                                                                         

  {{{                                                                                                                                                                                        
      {                                                                                                                                                                                      
        n2.store(1, mo_relaxed);                                                                                                                                                             
        if (n.load(mo_relaxed) != -1)                                                                                                                                                        
          n.store(1, mo_release);                                                                                                                                                            
      }                                                                                                                                                                                      
  |||                                                                                                                                                                                        
      {                                                                                                                                                                                      
        n.store(0, mo_release);                                                                                                                                                              
        int expected = 0;                                                                                                                                                                    
        do                                                                                                                                                                                   
        {                                                                                                                                                                                    
          desired = n2.load(mo_relaxed);                                                                                                                                                     
        }                                                                                                                                                                                    
        while (!n.compare_exchange_strong(&expected, desired, mo_acquire));                                                                                                                  
      }                                                                                                                                                                                      
  }}}                                                                                                                                                                                        

  assert(n == 1);                                                                                                                                                                            
}                                                                                                                                                                                            

In words, two atomic variables are initialized as n = -1 and n2 = 0;

Thread 1 first writes 1 to n2 and then to n, provided n wasn't (still) -1.

Thread 2 first writes 0 to n, then loads n2 and assigns n = n2 as long as n wasn't changed since it last read n (or when n is still 0).

After both threads joined n must equal 1 in every possible execution.

This code is part of an open source project of me and related to resetting a streambuf implementation to the start of the buffer lock-free while two threads concurrently read and write to it. This particular part has to do with 'sync-ing' (or, flushing written output).

I designed this and it works when every operation is sequentially consistent (this was brute force tested), but I can't wrap my head around the memory order requirements :/.


Solution

  • This assertion could fire if instructions (and cache updates) are performed in this order:

    • The first threads run all its instructions. So it just change the value of n2 from 0 to 1.
    • Then threads 2 runs. First it changes the value of n from -1 to 0.
    • Then threads 2 loads n2 (in n2.load(mo_relaxed)). At this point there are no synchronizations so any value previously stored in n2 (including the initialization value, see [intro.race]/1) can be loaded. Let's say it loads 0.
    • So threads 2 variable values are n==0 (the last one in the modification order of n),n2==0, expected==0, desired==0 before the compare exchange instruction. Then, the compare exchange succeeds, and stores 0 in n.

    At the end of the execution of the two threads you get n==0 and n2==1.

    With sequential consistency what I described cannot happen because if thread 1 saw n2==1 && n==-1, thread 2 could not see n2==0 && n==0.

    With this algorithm I am sure it is not possible to use any other memory order than sequential consistency.