cpu cpu-architecture cpu-cache

What's the theory and measurements behind cache line sizes?


Cache lines are often 64 bytes, though other sizes also exist.

My very simple question is: is there any theory behind this number, or is it just the result of the extensive testing and measurement that the engineers behind it undoubtedly do?

Either way, I was wondering what those are: the theory, if there is one, and the kinds of tests behind the decision.


Solution

  • In general, microarchitectural parameters tend to be tuned via performance modeling rather than derived from some sort of theoretical model. That is to say, there isn't anything like the "big O" notation that is used to characterize the performance of algorithms. Instead, benchmarks are run on performance simulators, and the results are used to guide the choice of the best parameters.

    That having been said, there are a few reasons why cache line size is going to be fairly stable in an established architecture:

    • Size is a power of 2: The line size should be a power of 2 in order to simplify addressing, so this limits the number of possible choices for cache line size.

    • Software is optimized based on cache parameters: Many microarchitectural parameters are completely hidden from the programmer. But the cache line size is one that is visible, and it can have a significant impact on performance for some applications. Once programmers have optimized their code for a 64-byte cache line size, the processor architects have an incentive to keep this same cache line size in future processors, even if the underlying technology changes in a way that makes a different cache line size easier to implement in hardware.

    • Cache coherence interacts with cache line size: The verification of cache coherence protocols is extremely difficult, and cache coherence is a source of many bugs in processors. Coherence is tracked at the cache line level, so changing the cache line size would require redoing all of the validation steps for a coherence protocol. So there would need to be a strong motivation for changing this parameter.

    • Changing cache line size could introduce false sharing: This is a special case of software being optimized based on cache parameters, but I think it is worth mentioning. Parallel programs are difficult to write in a way that actually provides performance benefits. Since data is tracked at cache line granularity, it is important to avoid false sharing, where threads that touch different variables still contend because those variables happen to sit in the same line. If the cache line size changed from one processor generation to another, this could cause false sharing in the new processor that did not exist in the old one.
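
    To make the first and last points above concrete, here is a minimal sketch in C. It assumes a power-of-two line size, so that the offset within a line and the start of the line both fall out of simple bit masks. The function names (`line_offset`, `line_base`, `same_line`) are made up for illustration and do not correspond to any real hardware or library interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when n has exactly one bit set (1, 2, 4, ..., 64, ...). */
bool is_power_of_two(uint64_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

/* Byte offset within the line: with a power-of-two size this is
 * simply the low bits of the address, no division needed. */
uint64_t line_offset(uint64_t addr, uint64_t line_size) {
    assert(is_power_of_two(line_size));
    return addr & (line_size - 1);
}

/* Address of the first byte of the line that contains addr. */
uint64_t line_base(uint64_t addr, uint64_t line_size) {
    assert(is_power_of_two(line_size));
    return addr & ~(line_size - 1);
}

/* Two addresses can only be falsely shared if they fall in the same line. */
bool same_line(uint64_t a, uint64_t b, uint64_t line_size) {
    return line_base(a, line_size) == line_base(b, line_size);
}
```

    With these helpers, `same_line(0x1000, 0x1020, 32)` is false while `same_line(0x1000, 0x1020, 64)` is true: two variables placed 32 bytes apart sit in separate lines on a 32-byte-line machine but share a line once the line size is 64 bytes.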

    Although 64 bytes is the line size used for x86 and most ARM processors, there are other line sizes in use. For instance, MIPS has many processors that have a 32-byte line size, and some that have a 16-byte line size.

    The line size is tuned to some degree to give the best performance for the workloads that the architecture is expected to run. However, once a line size is selected, and significant amounts of software have been written for the architecture, then the line size is unlikely to change in the future, for the reasons that I listed above.
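
    As a rough illustration of how software ends up depending on the line size, here is a sketch of a common C11 idiom: per-thread counters padded to one line each. The constant `ASSUMED_LINE_SIZE` and the names `padded_counter` and `counter_stride` are hypothetical, and the value 64 is an assumption baked in at compile time, not something queried from the hardware.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. This is hard-coded, not detected. */
#define ASSUMED_LINE_SIZE 64

/* One counter per thread, each aligned to its own assumed cache line. */
struct padded_counter {
    alignas(ASSUMED_LINE_SIZE) unsigned long value;
};

/* The alignment forces the compiler to pad the struct to a full line. */
static_assert(sizeof(struct padded_counter) == ASSUMED_LINE_SIZE,
              "each counter should occupy exactly one assumed line");

struct padded_counter counters[4];

/* Distance in bytes between neighbouring counters in the array. */
size_t counter_stride(void) {
    return (size_t)((char *)&counters[1] - (char *)&counters[0]);
}
```

    Code like this keeps neighbouring counters 64 bytes apart. On a processor with 128-byte lines, two neighbouring counters would land in the same line again, which is the kind of newly introduced false sharing described in the list above.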