If thread 1 runs:
this.Field.Flag = false;
...
var oldValue = Interlocked.Exchange(ref this.Field, newValue);
oldValue.Flag = true;
and thread 2 sees oldValue.Flag == true, is it guaranteed that it also sees this.Field == newValue even if it doesn't use Interlocked/Volatile to read this.Field?
I.e., is it guaranteed that the effects of instructions after Interlocked.Exchange are only visible after the effects of Interlocked.Exchange are themselves visible?
No, it's not guaranteed that a reader using plain (non-Volatile) loads will see this.Field == newValue after seeing oldValue.Flag == true. Load reordering is possible.
Interlocked.Exchange is a full barrier tied to the exchange (like x86 lock xchg [mem]), so on the writer side, yes: it guarantees the order in which stores on opposite sides of it become globally visible (i.e. commit to coherent L1d cache), preserving that StoreStore ordering. (It also enforces StoreLoad, LoadLoad, and LoadStore ordering.)
It appears that Interlocked operations are so strong that, at least in practice, current compilers for AArch64 use ldaxr/stlxr (like a C++ seq_cst operation) for the exchange, and then do a dmb ish full barrier afterward. So later stores are ordered after the store side of the exchange itself, even on weakly ordered ISAs where that costs extra barriers. Your code depends on that, not on the ordering of operations before vs. after Interlocked.Exchange; storing this.Field.Flag = false; was a red herring.
But the reader has to make sure its loads are ordered wrt. each other, which doesn't happen without Volatile.Read or something stronger. Otherwise the load of this.Field might hit in cache and read a value from before either store became visible, while the earlier load of oldValue.Flag might miss in cache and only get the later value. That's just one example of a possible mechanism for LoadLoad reordering at run time on a weakly-ordered ISA like AArch64.
Compile-time reordering is also possible in the reader, and is the only way for this to go wrong on x86-64, where the hardware memory model only allows StoreLoad reordering. (Program order + a store buffer with store-forwarding.)
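As a rough sketch of the reader side, the only difference between the unsafe and the safe version is whether the first load is an acquire load. The Node and Holder types and all method names here are hypothetical (the question doesn't show its own types), and the sketch assumes the reader already holds a reference to the old object and that Publish runs only once.

```csharp
using System;
using System.Threading;

// Hypothetical types for illustration; not from the question.
class Node { public bool Flag; }

class Holder
{
    public Node Field = new Node();

    // Writer, as in the question: the full barrier in Interlocked.Exchange
    // keeps the Flag store after the Field store.
    public Node Publish(Node newValue)
    {
        Node oldValue = Interlocked.Exchange(ref Field, newValue);
        oldValue.Flag = true;
        return oldValue;
    }

    // Unsafe reader: both loads are plain, so the JIT or a weakly-ordered
    // CPU may effectively perform the Field load first.
    public bool PlainReaderSeesStaleField(Node old)
    {
        bool flag = old.Flag;   // plain load
        Node current = Field;   // plain load, may be reordered earlier
        return flag && ReferenceEquals(current, old);
    }

    // Safe reader: the acquire load keeps the later Field load after it.
    public bool AcquireReaderSeesStaleField(Node old)
    {
        bool flag = Volatile.Read(ref old.Flag);  // acquire load
        Node current = Field;                     // cannot move before the acquire
        return flag && ReferenceEquals(current, old);
    }
}

static class Program
{
    static void Main()
    {
        // Single-threaded sanity check only; it cannot demonstrate the race.
        var holder = new Holder();
        Node old = holder.Field;
        holder.Publish(new Node());
        Console.WriteLine(holder.AcquireReaderSeesStaleField(old));
    }
}
```

Under those assumptions, AcquireReaderSeesStaleField should never return true, while PlainReaderSeesStaleField is allowed to, even if it rarely or never does on a given machine.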
In other words, you have release semantics for the writer side (fence; oldValue.Flag=... is at least as strong as Volatile.Write on oldValue.Flag), but you don't get any guarantees if you don't use an acquire load which will synchronize with it if it sees the value. https://preshing.com/20120913/acquire-and-release-semantics/
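To make the release/acquire pairing concrete, here is a minimal two-thread sketch. The names Node, Demo, and Run are made up for illustration, and it assumes the reader thread obtained its reference to the old object before the exchange happens.

```csharp
using System;
using System.Threading;

// Hypothetical type for illustration; not from the question.
class Node { public bool Flag; }

static class Demo
{
    static Node field = new Node();

    public static bool Run()
    {
        Node old = field;               // reader's pre-existing reference (assumption)
        Node replacement = new Node();
        bool sawNewField = false;

        var reader = new Thread(() =>
        {
            // Acquire load: spin until the writer's Flag store is visible.
            while (!Volatile.Read(ref old.Flag)) { Thread.Yield(); }
            // This plain load is ordered after the acquire load above.
            sawNewField = ReferenceEquals(field, replacement);
        });

        var writer = new Thread(() =>
        {
            // Full barrier: the Flag store below can't become visible
            // before the store half of the exchange.
            Node oldValue = Interlocked.Exchange(ref field, replacement);
            oldValue.Flag = true;
        });

        reader.Start();
        writer.Start();
        reader.Join();
        writer.Join();
        return sawNewField;
    }

    static void Main() => Console.WriteLine(Run());
}
```

With the Volatile.Read in place, the reader that observes Flag == true should also observe the new field value. Replacing it with a plain read of old.Flag removes that guarantee, and could even let the JIT hoist the load out of the spin loop.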
BTW, your example seems weird to me. You have var oldValue = Interlocked.Exchange(...) inside a function being run by one of the threads. So it's a local variable. How does another thread even see it at all? Is it a reference to something other threads can already see? And won't writing a value to oldValue itself maybe make oldValue.Flag == true before the assignment? But the initializer value isn't available until after Interlocked.Exchange returns, so that's probably fine.
I'm just assuming we're talking about stores to two different objects on opposite sides of an Interlocked.Exchange, and a reader that reads them both without any Volatile.Read or Interlocked operations (which are full fences).
(I don't know much C# other than its memory-ordering semantics, which are kind of fun because they're documented in terms of ordering a thread's accesses to cache-coherent shared memory, unlike C++'s formalism, which is defined only in terms of creating happens-before relationships and modification orders. C#'s lock-free atomic semantics even seem to have grown out of MS's x86-centric history, like all Interlocked RMWs being full barriers, which is unavoidable on x86 but costs extra on other ISAs. Some of the underlying ordering concepts are pretty universal at a hardware level, except for strongly-ordered x86 being a lot simpler, with different languages exposing different abstractions for it.)