
Can extensive usage of L3 cache by one core invalidate L1/L2 cache of another core?


The current Intel CPU cache architecture consists of per-core L1 and L2 caches and a shared, inclusive L3 cache. I have two similar questions regarding this:

  1. Can extensive memory access by a thread running on one core invalidate the L1/L2 cache of another core?
  2. Can the data required by a thread running on a single core occupy the whole L3 cache?

UPDATE: Be aware that Intel's Skylake server processors (Skylake-SP/Skylake-X) have a new L3 cache architecture that is non-inclusive.


Solution

  • The answer to both questions is yes.

    The second is simpler, so let's start there: the main benefit of the L3 cache is that it's shared. Sharing lets a single thread use more cache capacity when it needs it than it would get if the same resources were split statically between the cores.

    In other words, if all N cores are active and balanced, and the data is evenly distributed (i.e. no alignment issues), each core would get the same share (1/N) of the last-level cache (LLC, here the L3). However, if one core is more cache hungry, there's room for it to grow at the expense of the others that are currently less cache bound. In the extreme case, a single process can use the entire L3, except where the architecture reserves some subset of the L3 for a particular task (quite common) or a particular core (much less common).
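
    As a rough illustration of this sharing behavior, here is a minimal Python sketch. It assumes a single shared cache with true LRU replacement (real L3 caches use set-associative, approximate policies), and the function name and access traces are made up for the example:

```python
from collections import OrderedDict

def simulate_shared_lru(capacity, accesses):
    """Replay (core, address) accesses through one shared LRU cache.

    Toy model only: fully associative, true LRU, no private caches.
    Returns how many resident lines each core brought in.
    """
    cache = OrderedDict()  # address -> core that filled the line
    for core, addr in accesses:
        if addr in cache:
            cache.move_to_end(addr)        # hit: mark as most recently used
        else:
            if len(cache) >= capacity:
                cache.popitem(last=False)  # miss on a full cache: evict the LRU line
            cache[addr] = core
    occupancy = {}
    for owner in cache.values():
        occupancy[owner] = occupancy.get(owner, 0) + 1
    return occupancy

# Balanced case: two cores each loop over 4 private lines, cache holds 8.
balanced = [(c, (c, i % 4)) for i in range(32) for c in (0, 1)]
balanced_share = simulate_shared_lru(8, balanced)   # {0: 4, 1: 4}

# Cache-hungry case: core 1 touches 4 lines once, then core 0 streams over 8.
hungry = [(1, (1, i)) for i in range(4)] + [(0, (0, i)) for i in range(8)]
hungry_share = simulate_shared_lru(8, hungry)       # {0: 8}
```

    In the balanced trace each core ends up with half of the lines; in the second trace core 0 ends up owning every line, which is the "extreme case" described above.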

    As for the first question: if the L3 is inclusive (as is the case in most common CPUs, mostly for efficient snoop filtering), and one of the threads becomes dominant and takes it over entirely, then the data placed there by the other, less active cores will have to be evicted (and written back to memory if modified). To enforce inclusiveness, these lines will also have to be forced out of the respective cores' L1 and L2. If the data stayed in those private caches (breaking inclusiveness), the L3 could no longer be relied on to know what the cores hold, and you would lose coherency.
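
    The back-invalidation effect can be sketched with a toy model. This is only an illustration under simplifying assumptions: one private cache per core (standing in for L1+L2), fully associative LRU everywhere, no dirty-line write-backs, and a class name and sizes invented for the example:

```python
from collections import OrderedDict

class InclusiveHierarchy:
    """Toy model: per-core private caches under one shared, inclusive LRU L3."""

    def __init__(self, num_cores, private_lines, l3_lines):
        self.private = [OrderedDict() for _ in range(num_cores)]
        self.private_lines = private_lines
        self.l3 = OrderedDict()
        self.l3_lines = l3_lines
        self.back_invalidations = [0] * num_cores

    def access(self, core, addr):
        priv = self.private[core]
        if addr in priv:
            priv.move_to_end(addr)  # private hit: the L3 never sees this access
            return
        if addr in self.l3:
            self.l3.move_to_end(addr)
        else:
            if len(self.l3) >= self.l3_lines:
                victim, _ = self.l3.popitem(last=False)
                # Inclusion: a line leaving the L3 must leave every private cache too.
                for c, other in enumerate(self.private):
                    if victim in other:
                        del other[victim]
                        self.back_invalidations[c] += 1
            self.l3[addr] = True
        if len(priv) >= self.private_lines:
            priv.popitem(last=False)  # private eviction; the line stays in the L3
        priv[addr] = True

h = InclusiveHierarchy(num_cores=2, private_lines=4, l3_lines=8)
h.access(1, "A")
h.access(1, "B")                 # core 1's small working set
for i in range(8):
    h.access(0, ("stream", i))   # core 0 streams through 8 new lines
# Core 1 issued no further accesses, yet it lost both of its lines:
# h.back_invalidations == [0, 2] and h.private[1] is empty.
```

    Core 1's private cache had plenty of room for "A" and "B"; the lines were removed only because core 0's traffic pushed them out of the shared L3.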

    On systems where the L3 is not inclusive, this behavior will not happen, and the less active cores will be able to retain their data internally in the L1/L2. However, such systems may employ an inclusive snoop filter, which may suffer from the same problem (and force evictions again), depending on the exact cache protocol.
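
    A hedged sketch of that difference follows. It is a toy model and not a description of any particular CPU: the L3 data array is not modelled at all, the snoop filter is assumed to be a small fully associative LRU structure, and private evictions are assumed to be silent (the filter is not told about them). The function and parameter names are hypothetical:

```python
from collections import OrderedDict

def replay(accesses, filter_entries=None, private_lines=4, num_cores=2):
    """Replay (core, addr) accesses against private caches only.

    filter_entries=None  -> no inclusive tracking structure (purely non-inclusive).
    filter_entries=<int> -> inclusive snoop filter with that many LRU entries;
                            evicting an entry invalidates the private copies.
    Returns the final contents of each core's private cache.
    """
    private = [OrderedDict() for _ in range(num_cores)]
    snoop_filter = OrderedDict()  # addr -> set of cores that may hold the line
    for core, addr in accesses:
        priv = private[core]
        if addr in priv:
            priv.move_to_end(addr)  # private hit: nothing else is consulted
            continue
        if filter_entries is not None:
            if addr in snoop_filter:
                snoop_filter.move_to_end(addr)
            else:
                if len(snoop_filter) >= filter_entries:
                    victim, holders = snoop_filter.popitem(last=False)
                    for holder in holders:
                        # Untracked lines may not stay in any private cache.
                        private[holder].pop(victim, None)
                snoop_filter[addr] = set()
            snoop_filter[addr].add(core)
        if len(priv) >= private_lines:
            priv.popitem(last=False)  # silent eviction (assumption of this model)
        priv[addr] = True
    return [set(p) for p in private]

# Core 1 loads two lines, then core 0 streams through eight new ones.
trace = [(1, "A"), (1, "B")] + [(0, ("stream", i)) for i in range(8)]
no_filter = replay(trace)                       # core 1 keeps {"A", "B"}
with_filter = replay(trace, filter_entries=8)   # core 1 ends up empty
```

    Without the filter, core 1 keeps its two lines no matter what core 0 does. With the undersized inclusive filter, core 0's streaming displaces core 1's filter entries and the corresponding lines are invalidated, which mirrors the inclusive-L3 case.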