Let's say my server program has two threads (T1 and T2) running on separate cores. Both are serving RPCs coming in over the network from a single external client. The following sequence of operations occurs:
foo
is initialized to zerofoo
to 42foo
, write is cached in its core's L1 (not main memory)foo
foo
from its cache or main memory and sees that it is zerofoo
is zero .This violates external consistency.
Can this actually occur, or is there an implicit flush of T1's cache when it performs the I/O of sending the ACK back to the client (step 4)?
On x86 and x64 series, all caches are coherent and you'll get 42 on T2 even if the two threads do not share the same cache unit.
Your thought experiment can be further reduced to 2 cases: the two threads share the same cache unit (multi-core) or do not share (multi-cpu).
When they share a cache unit, both T1 and T2 will use the same cache, therefore they will both see 42 without any synchronization to memory.
In case the caches are not shared (i.e., multi-cpu), the ISA requires that the cache units be synchronized, and that will be transparent to software. Both threads will see 42 at the same address. This synchronization introduces some overhead though, therefore nowadays multi-core design is preferred (beside the reason cache is expensive).