In a NUMA multi-CPU architecture, the initial value of a is 0 and it is in a shared state between CPU-x and CPU-y. At time t0, CPU-x executes a = 1 followed immediately by an smp_wmb, and then at a later time t1, CPU-y executes b = a followed immediately by an smp_rmb. Is b guaranteed to be equal to 1 after the smp_rmb?
My understanding is that it is not guaranteed. The store a = 1 first goes into CPU-x's store buffer, and a read invalidate message is sent to CPU-y. CPU-y adds that message to its invalidate queue and immediately sends an acknowledgment back to CPU-x. After the smp_rmb, CPU-y processes its invalidate queue, invalidates its cached copy of a, and has to fetch the value again. However, at that point CPU-x might be busy with something else and may not yet have processed the acknowledgment and flushed its store buffer. Therefore, CPU-y could still read the old value of 0 from memory after the smp_rmb. Is my understanding correct?
"Yes, that's correct." from the comments of @Peter Cordes under the question.