How is modification order of a variable defined in C++?

I've read this Q&A: What is the significance of 'strongly happens before' compared to '(simply) happens before'?

The author gives an outline of an interesting evaluation that was not possible until C++20 but apparently is possible starting C++20:

.-- T3 y.store(3, seq_cst);                   --.                 (2)
|        |                                      | strongly
|        | sequenced before                     | happens
|        V                                      | before
|   T3 a = x.load(seq_cst); // a = 0    --.   <-'                 (3)
|                                         : coherence-
|                                         : ordered
|                                         : before
|   T1 x.store(1, seq_cst);             <-'   --. --.             (4)
|        |                                      |st |
|        | sequenced before                     |h  |
|        V                                      |b  |
| . T1 y.store(1, release);                   <-'   |             (x)
| |      :                                          | strongly
| |      : synchronizes with                        | happens
| |      V                                          | before
| > T2 b = y.fetch_add(1, seq_cst); // b = 1  --.   |             (1)
|        |                                      |st |
|        | sequenced before                     |h  |
|        V                                      |b  |
'-> T2 c = y.load(relaxed); // c = 3          <-' <-'

The numbers on the right denote a possible seq_cst order (and I added (x) for convenience; this line does not participate in SC order since it's not an SC operation).

I was trying to understand what is the modification order of y in this example, but I don't know how to determine it. (Or are there multiple possible modification orders of y for this evaluation?..)

More generally, how is modification order of an atomic variable defined in C++? For example there's this: https://en.cppreference.com/w/cpp/atomic/memory_order

Write-write coherence: If evaluation A that modifies some atomic M (a write) happens-before evaluation B that modifies M, then A appears earlier than B in the modification order of M

So it seems that modification order has to be consistent with a write-write happens-before one.

Is it the only thing that defines modification order?

In the above example, AFAIU there's no happens-before between (2) and (1); so which one is first in the modification order of y?

Is the mod order of y (x 1 2) (for this evaluation)?

I believe it might also help reasoning about seq_cst order...

Solution

The modification order for an object is the order a thread would see if it was spinning in a tight loop running while(1) { y.load(relaxed); }, and happened to see every every change. (Not that that's a useful way to actually observe it, but every object has its own modification order that all threads can always agree on, like on real CPUs thanks to MESI exclusive ownership being required to commit a store to L1d cache. Or see it early via private store-forwarding between SMT threads on POWER CPUs.)

Some random facts:

The modification order for a single object is compatible with program order within one thread, even with relaxed
After a thread sees a value with a load, it can only see that value or later ones in the modification order, even if the loads and stores are relaxed.
The observed value of a variable can't change more times than there are stores by other threads. If you had a bunch of stores from a bunch of threads, one of them will be last, and that's the value that all readers will see (eventually). And while the dust is settling, any given reader won't see the value changing back and forth, other than actually seeing later stores in the mod order. (See the "coherency" rules in [intro.races] in the standard)

I think this evaluation is showing actual effective order, so the mod order for y is just reading top to bottom, the values stored by operations (2) (x) (1) in that order. (Because it's using enough seq_cst operations that all threads can agree on the order, and showing that some other things end up ordering the release store (x) after the seq_cst store (2).)

~~This eval order is saying that the (2) store did become visible before the (x) store, so the (x) store replaced it.~~

~~And the dust has settled on that before the y.fetch_add (1), otherwise it would have synced-with (2) instead of (x).~~

Correction, apparently they're showing that a modification order of (x) (1) (2) is legal, despite the chain of different flavours of happens-before between (2) (storing 3) and the final y.load(). So that store becomes visible to T2 after the fetch_add, before the load.