Consider the following C++23 program:
#include <atomic>
#include <cassert>
#include <thread>
#define LOAD_ORDER std::memory_order_seq_cst
#define STORE_ORDER std::memory_order_seq_cst
std::atomic_int v[2]{0, 0};
int r[2];
void
f(int i)
{
v[i].store(1, STORE_ORDER);
r[i] = v[1-i].load(LOAD_ORDER);
}
int
main()
{
{
std::jthread t0(f, 0);
std::jthread t1(f, 1);
}
assert(r[0] || r[1]);
return 0;
}
One thread stores 1 to v[0]
then loads v[1]
. Concurrently, the other thread stores 1 to v[1]
then loads v[0]
. I want to ensure that at least one load returns 1. My questions:
At a high level, sequentially consistent operations cannot be reordered, so this should work. But how do you prove the above code is correct using only the sequenced before, synchronizes with, inter-thread happens before, and visible side effect concepts in the language spec? For instance, it doesn't seem to be the case that one store happens before the other.
Would the code still be correct if I changed LOAD_ORDER
to std::memory_order_relaxed
and left STORE_ORDER
as std::memory_order_seq_cst
? What about the converse of relaxing STORE_ORDER
but not load?
What's the most relaxed or efficient value that I can set LOAD_ORDER
and STORE_ORDER
to and still guarantee the assertion will not fail?
Assume all the variables are on different cache lines--I just didn't add padding to make the example simpler. Also assume no consume operations, so we can ignore the distinctions between weakly/strongly/plain happens before.
Let's go through a couple of rules and scenarios.
t0-store is sequenced before t0-load, therefore t0-store strongly happens before t0-load; same with t1.
In the absence of sequential consistency, we need to provide some inter-thread happens before relation between t0 and t1. Do any of the rules apply?
Therefore the operations are potentially concurrent.. This is to be expected. Informally, an acquire load can be reordered before a release store and vice versa.
What we do know is that t0-store may be coherence-ordered before t1-load, and/or t1-store before t0-load.
Now let's introduce sequentially consistent operations into the mix. This gives us a single total order S with a set of rules. In particular this:
First, if A and B are
memory_order::seq_cst
operations and A strongly happens before B, then A precedes B in S.
We have established that t0-store strongly happens before t0-load. So t0-store precedes t0-load in S. Same with t1. Therefore, in the single total order, the loads always come after the store by their own thread. You cannot have a load in the single total order before at least one store. So at least one store will always be visible to a load.
The only other rule regarding sequentially consistent operations (not fences) is this:
Second, for every pair of atomic operations A and B on an object M, where A is coherence-ordered before B, the following four conditions are required to be satisfied by S:
- if A and B are both
memory_order::seq_cst
operations, then A precedes B in S
All that means for us is that if a load sees the value of a store, it is ordered after it in the single total order. Not much surprise there.
At this point it is important to note that the rules only apply to sequentially consistent operations. There are no rules that apply between sequentially consistent and other atomic operations. With regard to any other operations, sequentially consistent operations simply act as release stores or acquire loads. The standard is very explicit in warning against this style of mixing.
Note 5:
memory_order::seq_cst
ensures sequential consistency only for a program that is free of data races and uses exclusivelymemory_order::seq_cst
atomic operations. Any use of weaker ordering will invalidate this guarantee unless extreme care is used. In many cases,memory_order::seq_cst
atomic operations are reorderable with respect to other atomic operations performed by the same thread.