caching multiprocessing x86-64 memory-model

Effect of memory ordering instructions on x86/x86_64 multiple sockets


I tried looking for the answer to this question in the Intel 64/IA-32 manuals, but couldn't find a definitive answer. The question is: do memory ordering instructions, such as SFENCE, affect the local processor only, or does their effect extend to the entire cache coherence domain, such as CPUs on a neighboring socket in a multi-socket system?


Solution

  • SFENCE affects the order in which the local CPU's stores become globally visible to other cores on the same and other sockets, or to memory-mapped I/O.

    Other cores can't tell whether you ran SFENCE or not; all they can observe is the order in which your memory operations become visible. In other words, the implementation of sfence is internal to a core and its store queue.

    sfence was introduced in SSE1, with PIII, before the first multi-core CPUs. At that time, the only SMP systems were multi-socket.

    Also note that sfence only does anything useful with weakly-ordered stores (movnt* or stores to write-combining memory regions). Normal stores already have "release" semantics on x86. For normal memory operations on x86, only mfence (and locked instructions) matter, because they prevent StoreLoad reordering.
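
As a rough illustration of the last point, here is a minimal sketch of a producer that fills a buffer with non-temporal stores and then publishes a flag. The names `payload`, `ready`, `publish`, and `try_consume` are hypothetical and not part of the original answer. The sketch assumes an x86-64 target and a compiler such as GCC or Clang that provides `<immintrin.h>` and C11 `<stdatomic.h>`.

```c
#include <immintrin.h>   /* _mm_stream_si32 (movnti), _mm_sfence */
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical shared state: a small payload and a "ready" flag. */
#define PAYLOAD_WORDS 16
static int payload[PAYLOAD_WORDS];
static atomic_int ready;   /* zero-initialized: nothing published yet */

/* Producer side: write the payload with weakly-ordered non-temporal
 * stores, then publish it by setting the flag. */
void publish(int value)
{
    for (size_t i = 0; i < PAYLOAD_WORDS; i++) {
        /* Non-temporal store: not ordered with respect to later
         * normal stores unless a fence intervenes. */
        _mm_stream_si32(&payload[i], value + (int)i);
    }

    /* Make the NT stores globally visible before any later store.
     * This only changes what the local core does with its own stores;
     * nothing is sent to other cores or sockets. */
    _mm_sfence();

    /* A normal x86 store already has release semantics with respect
     * to other normal stores, but compilers generally assume no NT
     * stores are pending, hence the explicit sfence above. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer side: copy the payload out if it has been published.
 * Returns 1 on success, 0 if the flag is not set yet. */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire)) {
        return 0;
    }
    for (size_t i = 0; i < PAYLOAD_WORDS; i++) {
        out[i] = payload[i];
    }
    return 1;
}
```

In a real program, one thread would call `publish(...)` and another thread, on the same socket or a different one, would poll `try_consume(...)`. Without the `_mm_sfence()`, the flag store could in principle become visible before some of the non-temporal stores, so the consumer might read stale payload words. With only normal stores, the fence would be unnecessary.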