
Linux perf command for cache references


I want to measure the cache miss rate of my code. perf list shows the supported events. My desktop has an Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz processor, and its perf list output contains cache-references and cache-misses, like this:

  cpu-cycles OR cycles                               [Hardware event]
  stalled-cycles-frontend OR idle-cycles-frontend    [Hardware event]
  stalled-cycles-backend OR idle-cycles-backend      [Hardware event]
  instructions                                       [Hardware event]
  cache-references                                   [Hardware event]
  cache-misses                                       [Hardware event]

I think cache-misses is mapped to the hardware event LLC-misses described in the Intel architectures software developer's manual (I confirmed this by comparing perf stat -e r412e with perf stat -e cache-misses; they give almost identical results). But how is cache-references counted? I didn't find an event, or a way to combine existing hardware events, that gives total cache references. So I'm wondering whether cache-references is accurate on my computer.
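For readers unfamiliar with the r412e notation, here is a minimal sketch of how such a raw event code can be split into its parts. It assumes the usual x86 layout, where the low byte of the hex value is the event select and the next byte is the unit mask. The helper name decode_raw_event is hypothetical and is not part of perf.

```python
def decode_raw_event(raw):
    """Split a perf raw event such as 'r412e' into (event_select, umask).

    Assumption: x86 encoding, bits 0-7 = event select, bits 8-15 = unit mask.
    """
    if not raw.startswith("r"):
        raise ValueError("expected a raw event of the form r<hex>")
    config = int(raw[1:], 16)
    event_select = config & 0xFF        # low byte
    umask = (config >> 8) & 0xFF        # next byte
    return event_select, umask


event, umask = decode_raw_event("r412e")
print(f"event select = 0x{event:02X}, umask = 0x{umask:02X}")
# prints "event select = 0x2E, umask = 0x41"
```

Event select 0x2E with unit mask 0x41 can then be looked up in the manual's list of architectural performance events.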


Solution

  • On Intel, I don't think perf provides an event that measures total cache references, because no such event exists at the hardware level. You should be able to compute this information yourself from the hardware cache events reported by perf list:

    L1-dcache-loads                                    [Hardware cache event]
    L1-dcache-load-misses                              [Hardware cache event]
    L1-dcache-stores                                   [Hardware cache event]
    L1-dcache-store-misses                             [Hardware cache event]
    L1-dcache-prefetches                               [Hardware cache event]
    L1-dcache-prefetch-misses                          [Hardware cache event]
    L1-icache-loads                                    [Hardware cache event]
    L1-icache-load-misses                              [Hardware cache event]
    L1-icache-prefetches                               [Hardware cache event]
    L1-icache-prefetch-misses                          [Hardware cache event]
    LLC-loads                                          [Hardware cache event]
    LLC-load-misses                                    [Hardware cache event]
    LLC-stores                                         [Hardware cache event]
    LLC-store-misses                                   [Hardware cache event]
    LLC-prefetches                                     [Hardware cache event]
    LLC-prefetch-misses                                [Hardware cache event]
    

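    Events not tagged with -misses represent the number of references to the associated cache, so the miss rate for a given cache and access type is the -misses count divided by the matching reference count (for example, LLC-load-misses / LLC-loads).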

    Note: this previous question and this man page about perf_event_open (used internally by perf) may help.
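As a rough sketch of the computation described above, the script below derives a miss rate from counter output. It assumes the counts were collected with something like perf stat -x, -e LLC-loads,LLC-load-misses ./your_program, and that each CSV line starts with the counter value, with the event name in the third field. The field layout can differ between perf versions, so check it against your own output. The helper names and the sample numbers are made up for illustration.

```python
def parse_perf_csv(text, sep=","):
    """Collect {event_name: count} from `perf stat -x,` style output.

    Assumption: field 0 is the counter value and field 2 is the event name.
    Lines whose value is not a plain integer (e.g. '<not supported>') are skipped.
    """
    counts = {}
    for line in text.splitlines():
        fields = line.strip().split(sep)
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        name = fields[2].split(":")[0]   # drop modifiers such as ':u'
        counts[name] = int(fields[0])
    return counts


def miss_rate(counts, cache="LLC", op="load"):
    """Return misses / references for one cache and access type."""
    refs = counts[f"{cache}-{op}s"]
    misses = counts[f"{cache}-{op}-misses"]
    return misses / refs if refs else 0.0


# Made-up sample output, only to show the shape of the data.
sample = """\
1000,,LLC-loads,500000,100.00,,
250,,LLC-load-misses,500000,100.00,,
<not supported>,,LLC-prefetches,0,100.00,,
"""

counts = parse_perf_csv(sample)
print(f"LLC load miss rate: {miss_rate(counts):.2%}")
# prints "LLC load miss rate: 25.00%"
```

The same function works for the L1 data cache by passing cache="L1-dcache", provided both the reference and the miss event are supported on the machine.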