c++ multithreading thread-safety mutex memory-model

Do the release-acquire visibility guarantees of std::mutex apply to only the critical section?

I'm trying to understand these sections under the heading Release-Acquire ordering https://en.cppreference.com/w/cpp/atomic/memory_order

They say regarding atomic load and stores:

If an atomic store in thread A is tagged memory_order_release and an atomic load in thread B from the same variable is tagged memory_order_acquire, all memory writes (non-atomic and relaxed atomic) that happened-before the atomic store from the point of view of thread A, become visible side-effects in thread B. That is, once the atomic load is completed, thread B is guaranteed to see everything thread A wrote to memory.

Then regarding mutexes:

Mutual exclusion locks, such as std::mutex or atomic spinlock, are an example of release-acquire synchronization: when the lock is released by thread A and acquired by thread B, everything that took place in the critical section (before the release) in the context of thread A has to be visible to thread B (after the acquire) which is executing the same critical section.

The first paragraph seems to say that with an atomic store and load (tagged memory_order_release and memory_order_acquire), thread B is guaranteed to see everything thread A wrote, including non-atomic writes.

The second paragraph seems to suggest that a mutex works the same way, except that what is visible to B is limited to whatever was done inside the critical section. Is that an accurate interpretation? Or would every write, even those before the critical section, be visible to B?

Solution

I think the cppreference quote about mutexes is written that way because, if you're using mutexes for synchronization, all shared variables used for communication should always be accessed inside the critical section.

The 2017 standard says in 4.7.1:

a call that acquires a mutex will perform an acquire operation on the locations comprising the mutex. Correspondingly, a call that releases the same mutex will perform a release operation on those same locations. Informally, performing a release operation on A forces prior side effects on other memory locations to become visible to other threads that later perform a consume or an acquire operation on A.

So everything before the unlock in the previous lock-holding thread happens-before everything after the lock in the next thread to take the lock.

This chains across threads: each thread that takes the lock sees the operations of every previous lock holder, and in turn makes them, together with its own, visible to later lock-takers.
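A minimal sketch of that chaining, assuming three threads that hand off through one mutex. All names here (chain_mutex, stage, x, y, wait_for_stage, run_chain) are made up for illustration. Thread 3 never synchronizes directly with thread 1, yet it may read x because thread 2's lock/unlock sits in between.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical names, for illustration only.
std::mutex chain_mutex;
int stage = 0;  // guarded by chain_mutex
int x = 0;      // written by thread 1, outside the critical section
int y = 0;      // written by thread 2, outside the critical section

// Poll until stage reaches `wanted`; stage is only ever read under the lock.
void wait_for_stage(int wanted) {
    for (;;) {
        {
            std::lock_guard<std::mutex> lock(chain_mutex);
            if (stage == wanted) return;  // guard unlocks on return
        }
        std::this_thread::yield();
    }
}

int run_chain() {
    stage = 0; x = 0; y = 0;
    int sum = 0;

    std::thread t1([] {
        x = 1;  // sequenced before t1's unlock
        std::lock_guard<std::mutex> lock(chain_mutex);
        stage = 1;
    });
    std::thread t2([] {
        wait_for_stage(1);  // locks after t1's unlock
        y = 2;              // sequenced before t2's unlock
        std::lock_guard<std::mutex> lock(chain_mutex);
        stage = 2;
    });
    std::thread t3([&sum] {
        wait_for_stage(2);  // locks after t2's unlock
        sum = x + y;        // sees t1's write to x through the chain
    });

    t1.join(); t2.join(); t3.join();
    assert(sum == 3);
    return sum;
}
```

Calling run_chain() from main should return 3. The polling loop is only there to force a deterministic order for the demonstration; real code would more likely use a condition variable.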

Update: I want to make sure I have a solid post because it is surprisingly hard to find this information on the web. Thanks to @Davis Herring for pointing me in the right direction.

The standard says

in 33.4.3.2.11 and 33.4.3.2.25:

mutex unlock synchronizes with subsequent lock operations that obtain ownership on the same object

(https://en.cppreference.com/w/cpp/thread/mutex/lock, https://en.cppreference.com/w/cpp/thread/mutex/unlock)

in 4.6.16:

Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

https://en.cppreference.com/w/cpp/language/eval_order

in 4.7.1.9:

An evaluation A inter-thread happens before evaluation B if

4.7.1.9.1) -- A synchronizes-with B, or

4.7.1.9.2) -- A is dependency-ordered before B, or

4.7.1.9.3) -- for some evaluation X

4.7.1.9.3.1) ------ A synchronizes with X and X is sequenced before B, or

4.7.1.9.3.2) ------ A is sequenced before X and X inter-thread happens before B, or

4.7.1.9.3.3) ------ A inter-thread happens before X and X inter-thread happens before B.

https://en.cppreference.com/w/cpp/atomic/memory_order

So a mutex unlock B synchronizes with, and therefore inter-thread happens before, a subsequent lock C on the same mutex, by 4.7.1.9.1.
Any evaluation A that is sequenced before the mutex unlock B also inter-thread happens before C, by 4.7.1.9.3.2.
Likewise, for any evaluation D sequenced after C, B inter-thread happens before D by 4.7.1.9.3.1, and so does A by 4.7.1.9.3.2.
Therefore an unlock() guarantees that all previous writes, even those outside the critical section, are visible after a matching lock().
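As a rough illustration of that conclusion, here is a sketch in which the shared value is written before the critical section and read after it, with only a flag actually guarded by the mutex. The names (handoff_mutex, payload, ready, run_handoff) are invented for this example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical names, for illustration only.
std::mutex handoff_mutex;
bool ready = false;  // guarded by handoff_mutex
int payload = 0;     // plain int, never accessed while holding the lock

int run_handoff() {
    payload = 0;
    ready = false;
    int seen = -1;

    std::thread producer([] {
        payload = 42;  // written OUTSIDE the critical section,
                       // but sequenced before the unlock below
        std::lock_guard<std::mutex> lock(handoff_mutex);
        ready = true;
    });  // the lock_guard destructor performs the unlock

    std::thread consumer([&seen] {
        for (;;) {
            std::unique_lock<std::mutex> lock(handoff_mutex);
            if (ready) break;  // this lock came after the producer's unlock
            lock.unlock();
            std::this_thread::yield();
        }
        // Read OUTSIDE the critical section, sequenced after the lock
        // that observed ready == true, so this is not a data race.
        seen = payload;
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

Calling run_handoff() from main should return 42. This layout is chosen only to exercise the guarantee; in ordinary code it is usually clearer to keep accesses to shared data inside the critical section.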

This conclusion is consistent with the way mutexes are implemented today (and were in the past): all loads and stores that precede the unlock in program order are completed before the unlock. (More accurately, the stores have to be visible before the unlock is visible, when observed by a matching lock operation in any thread.) There's no question that this is the accepted definition of release in theory and in practice (see, for example, https://preshing.com/20120913/acquire-and-release-semantics/). In fact, that's why acquire and release have those names when generalized to lock-free atomics: they originate in the creation of locks.
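To make the link between the names and locks concrete, here is a sketch of a toy spinlock built from a single std::atomic<bool>, where taking the lock is an acquire operation and dropping it is a release operation. SpinLock and run_spinlock_counter are hypothetical names, and this is not how std::mutex is actually implemented.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Toy spinlock, for illustration only.
class SpinLock {
    std::atomic<bool> locked{false};
public:
    void lock() {
        // Acquire: once exchange returns false we own the lock, and writes
        // made before the previous owner's release are visible to us.
        while (locked.exchange(true, std::memory_order_acquire)) {
            std::this_thread::yield();
        }
    }
    void unlock() {
        // Release: everything this thread wrote before this store becomes
        // visible to the next thread whose acquire reads the stored false.
        locked.store(false, std::memory_order_release);
    }
};

int run_spinlock_counter() {
    SpinLock spin;
    int counter = 0;  // plain int, protected only by the spinlock

    auto work = [&] {
        for (int i = 0; i < 10000; ++i) {
            spin.lock();
            ++counter;
            spin.unlock();
        }
    };

    std::thread a(work), b(work);
    a.join();
    b.join();
    assert(counter == 20000);
    return counter;
}
```

Calling run_spinlock_counter() from main should return 20000, since every increment of the non-atomic counter is ordered by the acquire/release pair on the flag.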