c++ atomic memory-barriers stdatomic

Can atomic_thread_fence(acquire) prevent previous loads being reordered after itself?


I understand that atomic_thread_fence in C++ is quite different from atomic stores/loads, and that it is not good practice to understand fences by mapping them onto a CPU's (e.g. x86's) mfence/lfence/sfence.

If I use c.load(memory_order_acquire), no stores/loads after c.load can be reordered before c.load. However, I think there is no restriction on stores/loads before c.load. I mean, some stores/loads before c.load can theoretically be reordered after it.
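
As a minimal sketch of the acquire-load case described above, here is the usual message-passing pattern. The names `payload`, `producer`, `consumer` and `run_acquire_load_demo` are made up for illustration, and the release store in the other thread is an assumption about how `c` gets written. The read of `payload` is sequenced after the acquire load, so it cannot be reordered before it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration: `payload` is ordinary (non-atomic)
// data, `c` plays the role of the atomic variable from the question.
int payload = 0;
std::atomic<bool> c{false};

void producer() {
    payload = 42;                              // plain store
    c.store(true, std::memory_order_release);  // publish the payload
}

int consumer() {
    // Spin until the acquire load observes the value written by producer().
    while (!c.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    // This read comes after the acquire load in program order, so it
    // cannot be reordered before that load and must see payload == 42.
    return payload;
}

int run_acquire_load_demo() {
    payload = 0;
    c.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(producer);
    std::thread t2([&seen] { seen = consumer(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling `run_acquire_load_demo()` from `main` should return 42 on every run. Nothing in this sketch constrains accesses that come before `c.load` in program order, which is the part I am unsure about.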

However, when it comes to atomic_thread_fence(memory_order_acquire), three kinds of objects are involved: the fence, the stores/loads before the fence, and the stores/loads after the fence.

I think the fence will certainly prevent stores/loads after it from being reordered before it, just as an atomic acquire load does. But will it prevent store/loads before this fence being reordered after itself?

I think the answer is yes, based on what I found while searching:

  1. In Preshing's article

    An acquire fence prevents the memory reordering of any read which precedes it in program order with any read or write which follows it in program order.

    So he does not specify a direction.

  2. In modernescpp

    there is an additional guarantee with the acquire memory barrier. No read operation can be moved after the acquire memory barrier.

    So I think he says yes directly?

However, I find no "official" answer on cppreference; it only specifies how fences and atomics interact with each other across different threads.

I am not sure, though, so I am asking this question.


Solution

  • After reading your question more carefully, it looks like your modernescpp link is making the same mistake that Preshing debunked in https://preshing.com/20131125/acquire-and-release-fences-dont-work-the-way-youd-expect/: fences are 2-way barriers; otherwise they'd be useless.

    A relaxed load followed by an acquire fence is at least as strong as an acquire load. Anything in this thread after the acquire fence happens after the load, thus it can synchronize-with a release store (or a release fence + relaxed store) in another thread.

    But will it prevent store/loads before this fence being reordered after itself?

    Stores, no: it's only an acquire fence, so earlier stores are not ordered by it.

    Loads, yes. In terms of a memory model where there is coherent shared cache/memory, and we're limiting local reordering of accesses to it, an acquire fence blocks LoadLoad and LoadStore reordering (see https://preshing.com/20130922/acquire-and-release-fences/).

    (This is not the way ISO C++'s formalism defines things. It works in terms of happens-before rules that order things relative to a load that saw a value from a store. In those terms, a relaxed load followed by an acquire fence can create a happens-before relationship with a release-sequence, so later code in this thread sees everything that happened before the store in the other thread.)
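
To make the fence case concrete, here is a minimal sketch of a relaxed load followed by an acquire fence, paired with a release fence plus relaxed store in the other thread (a plain release store would work too). The names `shared_value`, `ready`, `writer`, `reader` and `run_fence_demo` are hypothetical, chosen only for this example.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical names for illustration only.
int shared_value = 0;            // ordinary (non-atomic) data
std::atomic<bool> ready{false};  // flag accessed only with relaxed operations

void writer() {
    shared_value = 7;
    // Release fence: the plain store above is ordered before the
    // relaxed store below.
    std::atomic_thread_fence(std::memory_order_release);
    ready.store(true, std::memory_order_relaxed);
}

int reader() {
    // The relaxed load alone gives no ordering guarantee.
    while (!ready.load(std::memory_order_relaxed)) {
        std::this_thread::yield();
    }
    // Acquire fence: the load above cannot be reordered after this fence,
    // and the read below cannot be reordered before it. Because the load
    // saw the value stored after the release fence, the two fences
    // synchronize with each other.
    std::atomic_thread_fence(std::memory_order_acquire);
    return shared_value;  // guaranteed to read 7
}

int run_fence_demo() {
    shared_value = 0;
    ready.store(false, std::memory_order_relaxed);
    int seen = -1;
    std::thread t1(writer);
    std::thread t2([&seen] { seen = reader(); });
    t1.join();
    t2.join();
    return seen;
}
```

Calling `run_fence_demo()` from `main` should return 7 on every run. If the acquire fence let the preceding relaxed load sink below it, the read of `shared_value` could be ordered ahead of the flag check and this pattern would not be safe, which is why the fence has to act as a barrier in both directions.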