caching x86 cpu-architecture cpu-cache cpuid

Problem with the information displayed by the cpuid command


The last-level cache (LLC) information displayed by the cpuid command on Linux is:

  --- cache 3 ---
      cache type                           = unified cache (3)
      cache level                          = 0x3 (3)
      self-initializing cache level        = true
      fully associative cache              = false
      extra threads sharing this cache     = 0x1f (31)
      extra processor cores on this die    = 0xf (15)
      system coherency line size           = 0x3f (63)
      physical line partitions             = 0x0 (0)
      ways of associativity                = 0x13 (19)
      ways of associativity                = 0x6 (6)
      WBINVD/INVD behavior on lower caches = false
      inclusive to lower caches            = true
      complex cache indexing               = true
      number of sets - 1 (s)               = 24575

Why are there two "ways of associativity" lines? The /sys/devices/system/cpu/cpu0/cache/index3/number_of_sets file shows 20. Is 20 the associativity of the LLC? What does "ways of associativity = 0x6 (6)" mean here? And how can I tell how many cache sets each slice has? Thank you.

I am using a server. The kernel version is: Linux version 4.15.0-122-generic (buildd@lcy01-amd64-010) (gcc version 5.4.0 20160609 (Ubuntu 5.4.0-6ubuntu1~16.04.12)) #124~16.04.1-Ubuntu SMP.

The CPU information is:

Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 12
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2650 v4 @ 2.20GHz
Stepping: 1
CPU MHz: 2500.119
CPU max MHz: 2900.0000
CPU min MHz: 1200.0000
BogoMIPS: 4401.87
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 30720K
NUMA node0 CPU(s): 0,2,4,6,8,10,12,14,16,18,20,22,24,26,28,30,32,34,36,38,40,42,44,46
NUMA node1 CPU(s): 1,3,5,7,9,11,13,15,17,19,21,23,25,27,29,31,33,35,37,39,41,43,45,47

Solution

  • Using the other numbers that Linux gave you (64-byte lines, 24575 + 1 = 24576 sets, 30720 KiB of L3):

    size = bytes_per_line * sets * associativity
    30720 KiB = 64 * 24576 * associativity
    30720 KiB = 1536 KiB * associativity
    30720 KiB / 1536 KiB = associativity
    20 = associativity
    

    To check, I used the information from https://ark.intel.com/content/www/us/en/ark/products/91767/intel-xeon-processor-e5-2650-v4-30m-cache-2-20-ghz.html and https://en.wikichip.org/wiki/intel/microarchitectures/broadwell_(client). These sources indicate that each of the 12 cores has 2.5 MiB of (20-way associative) L3 cache, connected by a kind of ring bus, giving a total of 30 MiB of L3 cache per chip.

    Using that as "double-checked reality", I'd assume that both "ways of associativity" values that were displayed are wrong. The first ("ways of associativity = 19") is probably "associativity - 1", displayed without saying so (the same way they were too lazy to add 1 to "number of sets - 1", and the same way the line size is shown as 63 for 64-byte lines). I have no idea where the second "ways of associativity = 6" came from (the chip uses "6-way associative" for a shared TLB, so maybe that value was displayed in the wrong place).

    Note that you have 2 chips (in 2 sockets), and all of the above is "per chip" (it'd be two separate 30 MiB groups of L3 caches).
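
The arithmetic above can be sketched in a few lines of Python. This is only an illustration: the raw values are copied from the cpuid output in the question, the function name `decode_cache` is made up for this example, and the "add 1 to every field" rule is my reading of how Intel documents CPUID leaf 4. The per-slice figure additionally assumes that the sets are split evenly across 12 slices (one per core), which the tools do not report directly.

```python
# Raw CPUID leaf 4 fields as printed by cpuid in the question.
# Assumption: each field is stored as "value - 1".
RAW_LINE_SIZE = 63     # "system coherency line size"
RAW_PARTITIONS = 0     # "physical line partitions"
RAW_WAYS = 19          # first "ways of associativity" line
RAW_SETS = 24575       # "number of sets - 1 (s)"


def decode_cache(raw_line_size, raw_partitions, raw_ways, raw_sets):
    """Add 1 to each raw field and compute the total cache size in bytes."""
    line_size = raw_line_size + 1
    partitions = raw_partitions + 1
    ways = raw_ways + 1
    sets = raw_sets + 1
    return {
        "line_size": line_size,
        "partitions": partitions,
        "ways": ways,
        "sets": sets,
        "size_bytes": ways * partitions * line_size * sets,
    }


info = decode_cache(RAW_LINE_SIZE, RAW_PARTITIONS, RAW_WAYS, RAW_SETS)

# Cross-check against "L3 cache: 30720K" from lscpu.
assert info["size_bytes"] == 30720 * 1024

# Assumption: 12 equally sized slices, one per core.
SLICES = 12
sets_per_slice = info["sets"] // SLICES
slice_bytes = sets_per_slice * info["line_size"] * info["ways"]

print("associativity:", info["ways"])
print("total size (KiB):", info["size_bytes"] // 1024)
print("sets per slice:", sets_per_slice)
print("slice size (KiB):", slice_bytes // 1024)
```

If the even-split assumption holds, each slice works out to 2560 KiB (2.5 MiB), which matches the per-core L3 figure from the sources above.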