Similar to my previous question, consider this code:
-- Initially --
std::atomic<int> x{0};
std::atomic<int> y{0};
-- Thread 1 --
x.store(1, std::memory_order_release);
-- Thread 2 --
y.store(2, std::memory_order_release);
-- Thread 3 --
int r1 = x.load(std::memory_order_acquire); // x first
int r2 = y.load(std::memory_order_acquire);
-- Thread 4 --
int r3 = y.load(std::memory_order_acquire); // y first
int r4 = x.load(std::memory_order_acquire);
Is the weird outcome r1==1, r2==0 and r3==2, r4==0 possible in this case under the C++11 memory model? What if I were to replace all std::memory_order_acquire and std::memory_order_release orderings by std::memory_order_relaxed?
On x86 such an outcome seems to be forbidden (see this SO question), but I am asking about the C++11 memory model in general.
Bonus question:
We all agree that with std::memory_order_seq_cst the weird outcome would not be allowed in C++11. Now, Herb Sutter said in his famous atomic<> Weapons talk @ 42:30 that std::memory_order_seq_cst is just like std::memory_order_acq_rel, except that std::memory_order_acquire loads may not move before std::memory_order_release writes. I cannot see how this additional constraint would prevent the weird outcome in the above example. Can anyone explain?
The updated code in the question (see footnote 1), with the loads of x and y swapped in Thread 4, does actually test whether all threads agree on a global store order.
Under the C++11 memory model, the outcome r1==1, r2==0, r3==2, r4==0 is allowed and in fact observable on POWER.
On x86 this outcome is not possible, because there "stores are seen in a consistent order by other processors". This outcome is also not allowed in a sequentially consistent execution.
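To probe this on a particular machine, the four threads can be wrapped in a small litmus-test harness. The following is only a rough sketch: count_weird_outcomes and its template parameters are names made up for this illustration, and starting fresh threads on every iteration keeps the race window small, so a count of zero says nothing about what the memory model allows. The only firm guarantee is that the count is zero when both orderings are std::memory_order_seq_cst.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical helper: runs the four-thread test `iterations` times and
// returns how often r1==1, r2==0, r3==2, r4==0 was observed.
template <std::memory_order StoreOrder, std::memory_order LoadOrder>
int count_weird_outcomes(int iterations) {
    int weird = 0;
    for (int i = 0; i < iterations; ++i) {
        std::atomic<int> x{0};
        std::atomic<int> y{0};
        int r1 = -1, r2 = -1, r3 = -1, r4 = -1;

        std::thread t1([&] { x.store(1, StoreOrder); });
        std::thread t2([&] { y.store(2, StoreOrder); });
        std::thread t3([&] {
            r1 = x.load(LoadOrder);  // x first
            r2 = y.load(LoadOrder);
        });
        std::thread t4([&] {
            r3 = y.load(LoadOrder);  // y first
            r4 = x.load(LoadOrder);
        });
        // join() synchronizes with thread completion, so reading r1..r4
        // afterwards is race-free.
        t1.join();
        t2.join();
        t3.join();
        t4.join();

        // Each load can only return the initial value or the stored value.
        assert((r1 == 0 || r1 == 1) && (r4 == 0 || r4 == 1));
        assert((r2 == 0 || r2 == 2) && (r3 == 0 || r3 == 2));

        if (r1 == 1 && r2 == 0 && r3 == 2 && r4 == 0) ++weird;
    }
    return weird;
}
```

Calling count_weird_outcomes<std::memory_order_release, std::memory_order_acquire>(n) from main() exercises the code from the question. A serious litmus test would reuse long-lived threads and synchronize their start to widen the race window.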
Footnote 1: The question originally had both readers read x then y. A sequentially consistent execution of that is:
-- Initially --
std::atomic<int> x{0};
std::atomic<int> y{0};
-- Thread 4 --
int r3 = x.load(std::memory_order_acquire);
-- Thread 1 --
x.store(1, std::memory_order_release);
-- Thread 3 --
int r1 = x.load(std::memory_order_acquire);
int r2 = y.load(std::memory_order_acquire);
-- Thread 2 --
y.store(2, std::memory_order_release);
-- Thread 4 --
int r4 = y.load(std::memory_order_acquire);
This results in r1==1, r2==0, r3==0, r4==2. Hence, this is not a weird outcome at all.
To be able to say that each reader saw a different store order, we need them to read in opposite orders to rule out the last store simply being delayed.
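The difference between the two versions can be checked mechanically by enumerating every sequentially consistent interleaving. The sketch below is an illustration only: it models sequential consistency as a single shared memory with interleaved operations, not the C++11 acquire/release rules, and all names in it (Op, explore, sc_outcomes, make_program) are invented for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

// One memory operation: a store of `value` to a variable, or a load from it.
struct Op {
    bool is_store;
    int var;    // 0 = x, 1 = y
    int value;  // value to store (ignored for loads)
};

using Program = std::vector<std::vector<Op>>;  // one op sequence per thread
using Outcome = std::vector<int>;              // r1, r2, r3, r4

// Tries every interleaving that respects each thread's program order,
// applying the operations to one shared memory, and records the loaded values.
void explore(const Program& prog, std::vector<std::size_t>& pc,
             std::array<int, 2>& mem, std::vector<std::vector<int>>& loaded,
             std::set<Outcome>& out) {
    bool done = true;
    for (std::size_t t = 0; t < prog.size(); ++t) {
        if (pc[t] == prog[t].size()) continue;  // thread t has finished
        done = false;
        const Op& op = prog[t][pc[t]];
        assert(op.var == 0 || op.var == 1);
        int saved = mem[op.var];
        if (op.is_store) mem[op.var] = op.value;
        else loaded[t].push_back(mem[op.var]);
        ++pc[t];
        explore(prog, pc, mem, loaded, out);
        --pc[t];  // undo the step before trying the next thread
        if (op.is_store) mem[op.var] = saved;
        else loaded[t].pop_back();
    }
    if (done) {
        Outcome o;
        for (const auto& l : loaded) o.insert(o.end(), l.begin(), l.end());
        out.insert(o);
    }
}

std::set<Outcome> sc_outcomes(const Program& prog) {
    std::vector<std::size_t> pc(prog.size(), 0);
    std::array<int, 2> mem{{0, 0}};
    std::vector<std::vector<int>> loaded(prog.size());
    std::set<Outcome> out;
    explore(prog, pc, mem, loaded, out);
    return out;
}

// Threads 1 and 2 store, Thread 3 reads x then y. Thread 4 reads y then x
// in the updated question (true) or x then y in the original one (false).
Program make_program(bool thread4_reads_y_first) {
    Op store_x{true, 0, 1}, store_y{true, 1, 2};
    Op load_x{false, 0, 0}, load_y{false, 1, 0};
    Program p{{store_x}, {store_y}, {load_x, load_y}};
    if (thread4_reads_y_first) p.push_back({load_y, load_x});
    else p.push_back({load_x, load_y});
    return p;
}
```

With this model, sc_outcomes(make_program(false)) contains the footnote's outcome {1, 0, 0, 2}, while sc_outcomes(make_program(true)) does not contain {1, 0, 2, 0}: with opposite read orders, that outcome would need the store to x to come both before and after the store to y.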