c++ memory-barriers stdatomic memory-model

Transitivity of release-acquire


Just when I thought I had a grip on atomics, I came across another article. This is an excerpt from the GCC wiki, under Overall Summary:

 -Thread 1-       -Thread 2-                   -Thread 3-
 y.store (20);    if (x.load() == 10) {        if (y.load() == 10)
 x.store (10);      assert (y.load() == 20)      assert (x.load() == 10)
                    y.store (10)
                  }

Release/acquire mode only requires the two threads involved to be synchronized. This means that synchronized values are not commutative to other threads. The assert in thread 2 must still be true since thread 1 and 2 synchronize with x.load(). Thread 3 is not involved in this synchronization, so when thread 2 and 3 synchronize with y.load(), thread 3's assert can fail. There has been no synchronization between threads 1 and 3, so no value can be assumed for 'x' there.

The article is saying that the assert in thread 2 won't fail, but the one in thread 3 might.

I find that surprising. Here's my chain of reasoning that the thread 3 assert won't fail—perhaps someone can tell me where I'm wrong.

  1. Thread 3 observes y == 10 only if thread 2 wrote 10.
  2. Thread 2 writes 10 only if it saw x == 10.
  3. Thread 2 (or any thread) sees x == 10 only if thread 1 wrote 10. There are no further updates to x from any thread.
  4. Since thread 2 observed x == 10, thread 3, having synchronized with thread 2, should also observe x == 10.

Release/acquire mode only requires the two threads involved to be synchronized.

Can someone point to a source for this 2-party-only requirement, please? My understanding (granted, perhaps wrong) is that the producer has no knowledge of whom it's synchronizing with. I.e., thread 1 can't say, "my updates are only for thread 2". Likewise, thread 2 can't say, "give me the updates from thread 1". Instead, a release of x = 10 by thread 1 is for anyone to observe, if they so choose.

Thus, with x = 10 being the last update (by thread 1), any acquire anywhere in the system that happens after it (as ensured by transitive synchronization) is guaranteed to observe that write, isn't it?

This means that synchronized values are not commutative to other threads.

Regardless of whether it's true, the author perhaps meant transitive, not commutative, right?

Lastly, if I'm wrong above, I'm curious to know what synchronization operation(s) would guarantee that thread 3's assert won't fail.


Solution

  • Seems like you've found a mistake in the GCC wiki.
    The assert in T3 shouldn't fail in C++.
    Here are the explanations with the relevant quotes from the C++20 standard:

    1. x.store (10) in T1 happens before assert (x.load() == 10) in T3, because:
      • statements within every thread are ordered with sequenced before

      9 Every value computation and side effect associated with a full-expression is sequenced before every value computation and side effect associated with the next full-expression to be evaluated.

      • x.store (10) synchronizes with if (x.load() == 10) and y.store (10) synchronizes with if (y.load() == 10)

      2 An atomic operation A that performs a release operation on an atomic object M synchronizes with an atomic operation B that performs an acquire operation on M and takes its value from any side effect in the release sequence headed by A.

      • as a result x.store (10) inter-thread happens before assert (x.load() == 10)

      9 An evaluation A inter-thread happens before an evaluation B if
      (9.1)   — A synchronizes with B, or
      (9.2)   — A is dependency-ordered before B, or
      (9.3)   — for some evaluation X
      (9.3.1)    — A synchronizes with X and X is sequenced before B, or
      (9.3.2)    — A is sequenced before X and X inter-thread happens before B, or
      (9.3.3)    — A inter-thread happens before X and X inter-thread happens before B.

      • this also means x.store (10) happens before assert (x.load() == 10)

      10 An evaluation A happens before an evaluation B (or, equivalently, B happens after A) if:
      (10.1)   — A is sequenced before B, or
      (10.2)   — A inter-thread happens before B.

    2. the above means that x.load() in assert (x.load() == 10) must return the value 10 written by x.store (10).
      (We assume here that x was published correctly and therefore the initial value of x comes before x.store (10) in the modification order of x).

      18 If a side effect X on an atomic object M happens before a value computation B of M, then the evaluation B shall take its value from X or from a side effect Y that follows X in the modification order of M.
      [Note 18: This requirement is known as write-read coherence. — end note]
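
To make the chain above concrete, here is a minimal sketch of the three threads with explicit release stores and acquire loads. The explicit memory orders are an assumption (the wiki excerpt omits them but describes release/acquire mode), and the names run_once, stress, t2_ok, t3_ok and the labels A to E are hypothetical, introduced only for illustration. With those labels: A synchronizes with B and B is sequenced before C, so A inter-thread happens before C by (9.3.1); C synchronizes with D and D is sequenced before E, so C inter-thread happens before E by (9.3.1); then (9.3.3) gives A inter-thread happens before E, and (10.2) gives A happens before E.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Runs the three-thread litmus test once with explicit release/acquire orders
// (an assumption; the excerpt does not spell the orders out).
// Returns true if neither of the two "asserts" from the excerpt would have fired.
bool run_once() {
    std::atomic<int> x{0};
    std::atomic<int> y{0};
    bool t2_ok = true;  // stands in for thread 2's assert
    bool t3_ok = true;  // stands in for thread 3's assert

    std::thread t1([&] {
        y.store(20, std::memory_order_release);
        x.store(10, std::memory_order_release);               // A
    });

    std::thread t2([&] {
        if (x.load(std::memory_order_acquire) == 10) {        // B reads from A
            t2_ok = (y.load(std::memory_order_acquire) == 20);
            y.store(10, std::memory_order_release);           // C
        }
    });

    std::thread t3([&] {
        if (y.load(std::memory_order_acquire) == 10) {        // D reads from C
            // A happens before E via B, C and D, so write-read coherence
            // requires E to read 10 (no other write to x follows A).
            t3_ok = (x.load(std::memory_order_acquire) == 10); // E
        }
    });

    t1.join();
    t2.join();
    t3.join();
    // Each flag is written by at most one thread and read only after join().
    return t2_ok && t3_ok;
}

// Repeats the experiment; any failure would contradict the reasoning above.
void stress(int runs) {
    for (int i = 0; i < runs; ++i) {
        assert(run_once());
    }
}
```

Calling stress(1000) from main repeats the experiment. A passing run proves nothing by itself, since a stress test can only fail to find a counterexample; the guarantee comes from the standard wording quoted above.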