Tags: performance, cpu-cache, measurement, context-switch, memcache-stats

Simplest tool to measure a C program's cache hits/misses and CPU time on Linux?


I'm writing a small program in C, and I want to measure its performance.

I want to see how much time it spends running on the processor and how many cache hits and misses it causes. Information about context switches and memory usage would be nice to have too.

The program takes less than a second to execute.

I like the information in /proc/[pid]/stat, but I don't know how to read it after the program has exited or been killed.

Any ideas?

EDIT: I think Valgrind adds a lot of overhead. That's why I wanted a simple tool, like /proc/[pid]/stat, that is always there.


Solution

  • Use perf:

    perf stat ./yourapp
    

    See the kernel wiki perf tutorial for details. This uses the hardware performance counters of your CPU, so the overhead is very small.

    Example from the wiki:

    perf stat -B dd if=/dev/zero of=/dev/null count=1000000
    
    Performance counter stats for 'dd if=/dev/zero of=/dev/null count=1000000':
    
            5,099 cache-misses             #      0.005 M/sec (scaled from 66.58%)
          235,384 cache-references         #      0.246 M/sec (scaled from 66.56%)
        9,281,660 branch-misses            #      3.858 %     (scaled from 33.50%)
      240,609,766 branches                 #    251.559 M/sec (scaled from 33.66%)
    1,403,561,257 instructions             #      0.679 IPC   (scaled from 50.23%)
    2,066,201,729 cycles                   #   2160.227 M/sec (scaled from 66.67%)
              217 page-faults              #      0.000 M/sec
                3 CPU-migrations           #      0.000 M/sec
               83 context-switches         #      0.000 M/sec
       956.474238 task-clock-msecs         #      0.999 CPUs
    
       0.957617512  seconds time elapsed
    

    No need to load a kernel module manually; on a modern Debian system (with the linux-base package) it should just work. With the perf record -a / perf report combination you can also do full-system profiling. Any application or library that has debugging symbols will show up with details in the report.

    For visualization, flame graphs seem to work well. (Update 2020: the Hotspot UI has flame graphs integrated.)
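
If perf is not installed, part of what the question asks for (CPU time, context switches, peak memory) can still be collected after the program has exited, because the kernel hands the accounting data to the parent process when it waits for the child. The following is a minimal sketch under that assumption, for Linux with glibc. The helper names run_and_measure and print_usage are hypothetical, and this approach does not report cache hits or misses.

```c
#define _DEFAULT_SOURCE             /* for wait4() on glibc */
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: start the program named by argv[0], wait for it
 * to finish, and store the kernel's resource accounting in *ru.
 * Returns the raw wait status, or -1 if fork() or wait4() failed. */
int run_and_measure(char *const argv[], struct rusage *ru)
{
    assert(argv != NULL && argv[0] != NULL && ru != NULL);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }

    int status = 0;
    if (wait4(pid, &status, 0, ru) < 0)
        return -1;
    return status;
}

/* Print the fields that correspond to what the question asks about. */
void print_usage(const struct rusage *ru)
{
    printf("user CPU time:            %ld.%06ld s\n",
           (long)ru->ru_utime.tv_sec, (long)ru->ru_utime.tv_usec);
    printf("system CPU time:          %ld.%06ld s\n",
           (long)ru->ru_stime.tv_sec, (long)ru->ru_stime.tv_usec);
    printf("max resident set size:    %ld kB\n", ru->ru_maxrss);
    printf("voluntary ctx switches:   %ld\n", ru->ru_nvcsw);
    printf("involuntary ctx switches: %ld\n", ru->ru_nivcsw);
    printf("page faults (minor/major): %ld/%ld\n",
           ru->ru_minflt, ru->ru_majflt);
}
```

A main function would build an argument vector such as {"./yourapp", NULL}, call run_and_measure, and pass the filled struct rusage to print_usage. On Linux ru_maxrss is reported in kilobytes. The GNU time program (/usr/bin/time -v ./yourapp) prints much the same data without writing any code.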