The LLC (last-level cache) information that the cpuid command reports on Linux is:
--- cache 3 ---
cache type = unified cache (3)
cache level = 0x3 (3)
self-initializing cache level = true
fully associative cache = false
extra threads sharing this cache = 0x1f (31)
extra processor cores on this die = 0xf (15)
system coherency line size = 0x3f (63)
physical line partitions = 0x0 (0)
ways of associativity = 0x13 (19)
ways of associativity = 0x6 (6)
WBINVD/INVD behavior on lower caches = false
inclusive to lower caches = true
complex cache indexing = true
number of sets - 1 (s) = 24575
Why are there two "ways of associativity" lines? The /sys/devices/system/cpu/cpu0/cache/index3/number_of_sets file shows 20. Is 20 the associativity of the LLC? What does "ways of associativity = 0x6 (6)" mean here? And how can I tell how many cache sets each slice has? Thank you.
I am using a server. The kernel version is: Linux version 4.15.0-122-generic (buildd@lcy01-amd64-010) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)) #124~16.04.1-Ubuntu SMP.
The CPU information (from lscpu) is:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2500.119
CPU max MHz: 2900.0000
CPU min MHz: 1200.0000
BogoMIPS: 4401.87
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47
Using other numbers that Linux gave you:
size = bytes_per_line * sets * associativity
30720 KiB = 64 * 24576 * associativity
30720 KiB = 1536 KiB * associativity
30720 KiB / 1536 KiB = associativity
20 = associativity
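The same arithmetic as a small Python sketch, in case it helps to re-run it with other numbers. The function name is made up for illustration, and it assumes that the "line size = 63" and "number of sets - 1 = 24575" values from cpuid are both "real value minus 1":

```python
def associativity(size_bytes, line_bytes, sets):
    """Ways = total cache size / (bytes per line * number of sets)."""
    ways, remainder = divmod(size_bytes, line_bytes * sets)
    if remainder:
        raise ValueError("size is not a whole multiple of line_bytes * sets")
    return ways


l3_size = 30720 * 1024   # "L3 cache: 30720K" from lscpu, in bytes
line_size = 63 + 1       # cpuid showed 63; assumed to mean "line size - 1"
num_sets = 24575 + 1     # cpuid showed "number of sets - 1 (s) = 24575"

print(associativity(l3_size, line_size, num_sets))
```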
To check this, I used https://ark.intel.com/content/www/us/en/ark/products/91767/intel-xeon-processor-e5-2650-v4-30m-cache-2-20-ghz.html and https://en.wikichip.org/wiki/intel/microarchitectures/broadwell_(client). These sources indicate that each of the 12 cores has 2.5 MiB of (20-way associative) L3 cache, connected by a kind of ring bus, giving a total of 30 MiB of L3 cache per chip.
Using that as "double checked reality", I'd assume that both "ways of associativity" values that were displayed are wrong. The first ("ways of associativity = 19") is probably displaying "associativity - 1" without saying so, in the same way that the tool shows "number of sets - 1" instead of adding 1; a correct label would have been "ways of associativity - 1 = 19". I have no idea where the second "ways of associativity = 6" came from (the chip uses "6-way associative" for a shared TLB, so maybe that value was displayed in the wrong place).
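For what it's worth, Intel's CPUID leaf 4 does encode line size, partitions, ways and sets as "value minus 1", which fits the guess above. Here is a rough sketch of that decoding in Python. The EBX/ECX values are rebuilt from the fields printed in the question (they were not read from the real hardware), and the function name is only for illustration:

```python
def decode_leaf4(ebx, ecx):
    """Decode the cache geometry fields of CPUID leaf 4 (each stored minus 1)."""
    line_size = (ebx & 0xFFF) + 1            # EBX bits 11:0
    partitions = ((ebx >> 12) & 0x3FF) + 1   # EBX bits 21:12
    ways = ((ebx >> 22) & 0x3FF) + 1         # EBX bits 31:22
    sets = ecx + 1                           # ECX holds "number of sets - 1"
    return {
        "line_size": line_size,
        "partitions": partitions,
        "ways": ways,
        "sets": sets,
        "size_bytes": ways * partitions * line_size * sets,
    }


# Assumed register contents, reconstructed from the cpuid output above:
# ways - 1 = 19, partitions - 1 = 0, line size - 1 = 63, sets - 1 = 24575.
ebx = (19 << 22) | (0 << 12) | 63
ecx = 24575

info = decode_leaf4(ebx, ecx)
print(info["ways"], info["sets"], info["size_bytes"] // 1024, "KiB")
```

With those inputs the decoded size matches the 30720K that lscpu reports.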
Note that you have 2 chips (in 2 sockets), and all of the above is "per chip" (so there are two separate 30 MiB groups of L3 caches).
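As for sets per slice, here is a rough estimate only. It assumes one L3 slice per core (12 per chip) and that the sets are split evenly between slices. The actual mapping of addresses to slices is not publicly documented (cpuid reports "complex cache indexing = true"), so treat this as arithmetic, not as a description of the hardware:

```python
cores_per_chip = 12                      # "Core(s) per socket" from lscpu
ways = 20                                # associativity worked out above
line_size = 64                           # bytes per cache line
total_sets = 24576                       # "number of sets - 1" plus 1

slice_bytes = int(2.5 * 1024 * 1024)     # 2.5 MiB of L3 per core (per the sources above)

# Assumption: every slice has the same geometry (same ways and line size).
sets_per_slice = slice_bytes // (ways * line_size)

print(sets_per_slice)                    # sets in one slice under these assumptions
print(sets_per_slice * cores_per_chip == total_sets)
```

Both routes agree: 2.5 MiB / (20 * 64) and 24576 / 12 give the same per-slice set count.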