Consider the following test program, compiled and run on an implementation that fully implements C2011 atomics and threads.
#include <stdio.h>
#include <stdatomic.h>
#include <threads.h>
#if !ATOMIC_INT_LOCK_FREE
#error "Program behavior is only interesting if atomic_int is lock-free."
#endif
static atomic_int var;
int inc_print(void *unused)
{
atomic_fetch_add(&var, 1);
printf(" %d", atomic_load(&var));
return 0;
}
int main(void)
{
thrd_t thrd;
if (thrd_create(&thrd, inc_print, 0) == thrd_success) {
inc_print(0);
thrd_join(&thrd, 0);
}
putchar('\n');
return 0;
}
I have managed to convince myself that all of the following statements are true:
atomic_load
must observe the increment performed by that same thread, so it cannot read a zero.atomic_load
may or may not observe the increment performed by the other thread. (The other thread might not get scheduled at all until after the atomic_load.) Therefore, it can read either a 1 or a 2.printf
are serialized only against each other. Therefore, if one thread's atomic_load
reads a 1 and the other thread's atomic_load
reads a 2, either 1 2
or 2 1
may be printed.atomic_load
s to observe the increment performed by the other thread, so the output 2 2
is also possible.What I'm not sure of, though: Is it possible for neither of the atomic_load
s to observe the increment performed by the other thread? That is, is the output 1 1
possible?
Also, does relaxing the memory model change anything?
Your conclusions look correct to me.
The default memory_order_seq_cst
guarantees this whole program executes in a sequentially consistent manner, since it's data-race free and doesn't use any non-SC atomics. So the possible results are only interleavings of program-order.
This allows both increments then both loads, but one increment must come after the other, seeing its 1
result and writing a 2
. And the load in that thread must come after it, so at least one thread sees a 2
. The 1 1
result is impossible, the 2 2
result can happen.
Relaxed atomics don't introduce any new possibilities here; we can obtain the same guarantees from the rules for operations on the same atomic variable that apply regardless of the memory_order parameter.
var
exists, and a total of two atomic increments must do a total of +=2
. (Atomic RMWs are guaranteed to read the latest value for this reason, to make sure RMWs on the same object are serialized with each other, not both loading a 0 and writing a 1. That wouldn't be an atomic increment.)load
after a fetch_add
in the same thread must see its result or some later value in the modification order. (In C++ this is the write-read coherence guarantee, and sequenced-before (in the same thread) forcing ordering between two operations on the same atomic object. edits welcome with a link to the equivalent language in the C11 standard.)And yes, the printf
s are ordered independently of the atomic modifications to var
. It effectively locks stdout
. If that works like the rules for C11 mtx_lock
, that's like an acquire
operation on the mutex, so it can be starting to take the lock before the increment or load complete.
Not that that's relevant; it's not required for the 2 1
output to be possible. Even if locking was seq_cst
, you could get to a state where neither printf had started, but all the atomics had finished. With one thread having a 1 and the other a 2 as their temporaries. Then it's just chance which gets to print first.