c++ x86 sse cpu-cache

What happens with a non-temporal store if the data is already in cache?


When you use non-temporal stores, e.g. movntq, and the data is already in cache, will the store update the cache instead of writing out to memory? Or will it update the cache line and write it out, evicting it? Or what?

Here's a fun dilemma. Suppose thread A is loading the cache line containing x and y. Thread B writes to x using an NT store. Thread A writes to y. There's a data race here if B's store to x can be in transit to memory while A's load is happening. If A sees the old value of x, but the write of x has already happened, then the later write of y and the eventual write-back of the cache line will clobber the unrelated value of x. I assume the processor somehow prevents that from happening? I can't see how anyone could build a reliable system using NT stores if that were allowable behavior.


Solution

  • All of the behaviors you describe are sensible implementations of a non-temporal store. In practice, on modern x86 CPUs, the actual semantics are that there's no effect on the L1 cache but the L2 (and higher-level caches, if any) will not evict a cache line to store the non-temporal fetch results.

    There is no data race because the caches are hardware coherent. This coherence is not affected in any way by the decision to evict a cache line.
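
To make the scenario concrete, here is a minimal single-threaded sketch, assuming an x86-64 target with SSE2 and 64-byte cache lines. The names `Line`, `Result`, and `nt_store_into_cached_line` are made up for illustration. It warms a line into cache, issues a non-temporal store to `x` (using `_mm_stream_si32`, the 32-bit MOVNTI form, rather than `movntq`) and an ordinary store to `y` in the same line, then reads both back. It only shows what the issuing core observes; it cannot tell you whether the line was updated in place or evicted, and it does not by itself demonstrate cross-core coherence.

```cpp
#include <immintrin.h>  // _mm_stream_si32, _mm_sfence

// Hypothetical layout: x and y share one cache line (64 bytes assumed).
struct alignas(64) Line {
    int x;
    int y;
};

struct Result {
    int x;
    int y;
};

// Warm the line into cache, then mix a non-temporal store to x with an
// ordinary store to y, and read both values back on the same core.
Result nt_store_into_cached_line() {
    static Line line{1, 2};

    // Ordinary loads bring the line into this core's cache.
    volatile int warm = line.x + line.y;
    (void)warm;

    _mm_stream_si32(&line.x, 42);  // non-temporal store to x
    line.y = 7;                    // ordinary store to y, same cache line

    // NT stores are weakly ordered; SFENCE orders them before later stores,
    // which matters if another thread is going to read x.
    _mm_sfence();

    // Neither store clobbers the other: both new values are observed.
    return Result{line.x, line.y};
}
```

Calling `nt_store_into_cached_line()` should yield `x == 42` and `y == 7`: the ordinary store to `y` does not write a stale copy of `x` back over the non-temporal store.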