
Principle of Locality and Call Instructions


In discussing the principle of locality, my textbook makes the following statements:

Except for branch and call instructions, which constitute only a small fraction of all program instructions, program execution is sequential. Hence, in most cases, the instruction to be fetched immediately follows the last instruction fetched.

As a novice, I find this difficult to believe. All of the code I've encountered is highly populated with call instructions. Indeed, it seems to me that call instructions actually perform the most substantial actions in a program.

I would greatly appreciate it if someone could please elaborate on why this concept is true, despite the substantial role of call instructions in programs.


Solution

  • I picked a random binary on my computer, the cargo package manager. Then I:

    • disassembled it with: otool -tvV cargo > assembly
    • extracted just the instruction mnemonics: awk '{print $2}' assembly > instructions
    • counted each instruction: sort instructions | uniq -c | sort -n > count

    I processed the result in LibreOffice Calc into a list of occurrence percentages for each instruction. Here are the ones that each make up more than 1% of the program (together they account for 86%; for brevity I discarded the long tail of rarer instructions):

    | 34.83% | movq    |
    | 7.30%  | leaq    |
    | 7.00%  | callq   |
    | 6.90%  | je      |
    | 5.61%  | movl    |
    | 4.86%  | cmpq    |
    | 3.77%  | testq   |
    | 3.11%  | jmp     |
    | 2.23%  | jne     |
    | 2.17%  | popq    |
    | 2.05%  | pushq   |
    | 1.69%  | addq    |
    | 1.29%  | cmpl    |
    | 1.20%  | movabsq |
    | 1.18%  | movb    |
    | 1.05%  | xorl    |
    

    There's definitely a lot of branching and calling going on here (callq, jmp, je, jne), but also quite a lot of memory operations. Memory operations are comparatively slow and account for much of the program's runtime. A single data-movement instruction, movq, makes up more than a third of all instructions!
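    The counting and percentage steps above can be sketched in a few lines of Python (the sample list below is a made-up stand-in for the real `instructions` file produced by the awk step, just to show the shape of the computation):

    ```python
    from collections import Counter

    def instruction_mix(mnemonics):
        """Return (mnemonic, percentage) pairs, most frequent first."""
        counts = Counter(mnemonics)
        total = sum(counts.values())
        return [(op, 100.0 * n / total) for op, n in counts.most_common()]

    # Tiny illustrative stand-in for the real `instructions` file.
    sample = ["movq", "movq", "callq", "je", "movq", "leaq", "movq", "jmp"]

    for op, pct in instruction_mix(sample):
        print(f"{pct:5.2f}%  {op}")
    ```

    To run it on the real data, replace `sample` with something like `open("instructions").read().split()`.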

    CPU caches keep recently referenced memory close to the CPU core, which speeds up future accesses to the same data. This works because of the Principle of Locality: programs tend to reuse the same memory locations within a short time span (temporal locality) and to access locations near recently used ones (spatial locality). Sequential instruction fetch, which is what your textbook is describing, is a prime example of spatial locality: the next instruction is almost always adjacent in memory to the last one, so it is very likely already sitting in the cache.
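    A toy simulation makes this concrete. The direct-mapped cache below uses arbitrary illustrative sizes (64 lines of 64 bytes, not any real CPU's geometry), and compares the hit rate of sequential access, which is what straight-line instruction fetch looks like, against random access over the same address range:

    ```python
    import random

    class DirectMappedCache:
        """Toy direct-mapped cache: `lines` cache lines of `line_size` bytes."""
        def __init__(self, lines=64, line_size=64):
            self.lines = lines
            self.line_size = line_size
            self.tags = [None] * lines   # which memory block each line holds
            self.hits = self.misses = 0

        def access(self, addr):
            block = addr // self.line_size   # which memory block this byte is in
            index = block % self.lines       # which cache line that block maps to
            if self.tags[index] == block:
                self.hits += 1
            else:
                self.misses += 1
                self.tags[index] = block     # evict and fill

        def hit_rate(self):
            return self.hits / (self.hits + self.misses)

    # Sequential fetch of 4-byte words over 64 KiB: each 64-byte line serves
    # 16 consecutive fetches, so only the first fetch per line misses.
    seq = DirectMappedCache()
    for addr in range(0, 64 * 1024, 4):
        seq.access(addr)

    # Random fetches over the same 64 KiB range: locality is destroyed.
    rng = random.Random(0)
    rand = DirectMappedCache()
    for _ in range(16 * 1024):
        rand.access(rng.randrange(0, 64 * 1024, 4))

    print(f"sequential hit rate: {seq.hit_rate():.2%}")
    print(f"random hit rate:     {rand.hit_rate():.2%}")
    ```

    Sequential access hits about 94% of the time (15 of every 16 fetches land in an already-loaded line), while random access hits only a few percent of the time, which is why mostly-sequential execution makes instruction caches so effective.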