c++ atomic memory-barriers lock-free

memory order with multiple stores


Consider the example below. Assume that barrier is initialized to 0.

There is one producer thread and two consumer threads that constantly poll barrier. When a consumer sees its barrier set, it decrements runcnt. The producer thread waits for runcnt to reach 0. I am confused about the ordering of the multiple store operations inside the producer.

If the stores take effect in the order written, I think the code runs as expected. But if a barrier store is reordered before the runcnt store, it seems the assert could fail. Am I missing anything? Is there a way to fix this?


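#include <atomic>
#include <cassert>

using std::atomic;
using std::atomic_int;
using std::memory_order_relaxed;

void cpu_pause();  // assumed to be defined elsewhere (a spin-wait hint)

extern atomic<int> barrier[2];  // both elements start at 0
atomic_int runcnt{0};

void producer() {
    // Publish the number of consumers to wait for, then set both barriers.
    runcnt.store(2, memory_order_relaxed);
    barrier[0].store(1, memory_order_relaxed);
    barrier[1].store(1, memory_order_relaxed);

    // Spin until both consumers have decremented runcnt.
    while (runcnt.load(memory_order_relaxed)) {
        cpu_pause();
    }
}

void consumer(unsigned index) {
    while (true) {
        // Atomically test and clear this consumer's barrier flag.
        if (barrier[index].exchange(0, memory_order_relaxed)) {
            int prev = runcnt.fetch_sub(1, memory_order_relaxed);
            assert(prev > 0);
        }
    }
}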

Solution

  • By "the order written", do you mean the case where the operations happen to become visible in an order that seq_cst would also allow? You could of course require that by using seq_cst on all the operations.

    I think the minimum on the consumer side is for barrier[i].exchange to be acquire.

    On the producer side, either both barrier[i] stores need to be release, or you can put one std::atomic_thread_fence(std::memory_order_release) right after runcnt.store, so that it sits between that store and both barrier stores.

    That makes the exchange synchronize with whichever barrier store it reads from (assuming it read a 1, so the if body runs at all). As a result, runcnt.store(2) happens-before the fetch_sub, so the fetch_sub cannot see the old 0.

    runcnt.store(relaxed); barrier[0].store(release); barrier[1].store(relaxed) would not be sufficient in the C++ memory model, or even in practice when compiling for some ISAs: the final relaxed store can pass the release store, because a release operation is only a one-way barrier. This is a key difference between fences and operations: https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/ . Even making the middle store seq_cst wouldn't be sufficient: it's still just an operation, not a two-way fence.
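
The two producer-side options described above, paired with an acquire exchange in the consumer, might look roughly like the sketch below. Several names are assumptions added for illustration and are not in the question: stop (so the consumers can exit), violations (a counter standing in for the original assert so a caller can inspect it), run_rounds (a small driver), and std::this_thread::yield() as a portable stand-in for cpu_pause(). barrier is also defined here rather than declared extern, so the sketch links on its own.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> barrier[2];        // static storage, so both start at 0
std::atomic<int> runcnt{0};
std::atomic<bool> stop{false};      // hypothetical: lets consumers exit
std::atomic<int> violations{0};     // hypothetical: counts prev <= 0 events

// Option 1: make every barrier store a release store.
void producer_release_stores() {
    runcnt.store(2, std::memory_order_relaxed);
    barrier[0].store(1, std::memory_order_release);
    barrier[1].store(1, std::memory_order_release);
    while (runcnt.load(std::memory_order_relaxed) != 0) {
        std::this_thread::yield();  // stand-in for cpu_pause()
    }
}

// Option 2: one release fence between runcnt.store and both barrier stores.
void producer_release_fence() {
    runcnt.store(2, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_release);
    barrier[0].store(1, std::memory_order_relaxed);
    barrier[1].store(1, std::memory_order_relaxed);
    while (runcnt.load(std::memory_order_relaxed) != 0) {
        std::this_thread::yield();
    }
}

void consumer(unsigned index) {
    while (!stop.load(std::memory_order_relaxed)) {
        // Acquire: if this reads a 1, the producer's runcnt.store(2)
        // happens-before the fetch_sub below.
        if (barrier[index].exchange(0, std::memory_order_acquire)) {
            int prev = runcnt.fetch_sub(1, std::memory_order_relaxed);
            if (prev <= 0) {
                violations.fetch_add(1, std::memory_order_relaxed);
            }
        } else {
            std::this_thread::yield();
        }
    }
}

// Hypothetical driver: runs the producer `rounds` times against two
// consumers and returns how many times a consumer saw prev <= 0.
int run_rounds(int rounds, bool use_fence) {
    stop.store(false);
    violations.store(0);
    std::thread c0(consumer, 0u);
    std::thread c1(consumer, 1u);
    for (int i = 0; i < rounds; ++i) {
        if (use_fence) {
            producer_release_fence();
        } else {
            producer_release_stores();
        }
    }
    stop.store(true);
    c0.join();
    c1.join();
    assert(runcnt.load() == 0);     // every round was fully drained
    return violations.load();
}
```

Calling run_rounds(500, false) or run_rounds(500, true) should return 0 for either option. Keep in mind that a stress run like this cannot prove the ordering is correct (the relaxed version may also pass on strongly ordered hardware such as x86); the guarantee comes from the release/acquire reasoning, not from the test.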