How to calculate MIPS using perf stat


The following answer to "Benchmarking - How to count number of instructions sent to CPU to find consumed MIPS" suggests that:

perf stat ./my_program on Linux will use CPU performance counters to record how many instructions it ran, and how many core clock cycles it took. (And how much CPU time it used, and will calculate MIPS for you).


An example run produces the following output, which does not contain a calculated MIPS value.

 Performance counter stats for './hello.py':

       1452.607792 task-clock (msec)         #    0.997 CPUs utilized
               327 context-switches          #    0.225 K/sec
               147 cpu-migrations            #    0.101 K/sec
            35,548 page-faults               #    0.024 M/sec
     2,254,593,107 cycles                    #    1.552 GHz                     [26.64%]
   <not supported> stalled-cycles-frontend
   <not supported> stalled-cycles-backend
     1,652,281,933 instructions              #    0.73  insns per cycle         [38.87%]
       353,431,039 branches                  #  243.308 M/sec                   [37.95%]
        18,536,723 branch-misses             #    5.24% of all branches         [38.06%]
       612,338,241 L1-dcache-loads           #  421.544 M/sec                   [25.93%]
        41,746,028 L1-dcache-load-misses     #    6.82% of all L1-dcache hits   [25.71%]
        25,531,328 LLC-loads                 #   17.576 M/sec                   [26.39%]
         1,846,241 LLC-load-misses           #    7.23% of all LL-cache hits    [26.26%]

       1.456531157 seconds time elapsed

[Q] How can I calculate MIPS correctly from the output of perf stat? Should I compute instructions / seconds_time_elapsed from the values reported by perf stat?


Solution

  • It's simply instructions / seconds, divided by 1 million to scale for the mega metric prefix.

    Using the total elapsed time will give you MIPS for the whole program, total across all cores, and counting any time spent sleeping / waiting against it.

    Task-clock will count total CPU time used on all cores, so it will give you the average MIPS across all cores used, not counting any time spent sleeping. (task-clock:u would count only user-space time, but task-clock counts time spent in the kernel as well.)
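
As a minimal sketch of the two calculations described above, the following Python snippet plugs in the numbers from the perf stat output in the question. The helper names parse_count and mips are made up for illustration and are not part of perf. Note that the instructions line in that output carries a [38.87%] annotation, which suggests the event was multiplexed and the count is a scaled estimate, so the results are approximate.

```python
def parse_count(text: str) -> int:
    """Convert a perf-style count such as '1,652,281,933' to an int.

    Hypothetical helper for this example; perf itself does not provide it.
    """
    return int(text.replace(",", ""))


def mips(instructions: int, seconds: float) -> float:
    """Millions of instructions per second: instructions / seconds / 1e6."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return instructions / seconds / 1e6


# Values copied from the perf stat output in the question.
instructions = parse_count("1,652,281,933")
elapsed_s = 1.456531157            # "seconds time elapsed" (wall-clock time)
task_clock_s = 1452.607792 / 1000  # "task-clock (msec)" converted to seconds

# MIPS over wall-clock time: time spent sleeping or waiting counts against it.
mips_wall = mips(instructions, elapsed_s)

# MIPS over CPU time actually used: sleeping or waiting is not counted.
mips_cpu = mips(instructions, task_clock_s)

print(f"MIPS (elapsed time): {mips_wall:.1f}")
print(f"MIPS (task-clock):   {mips_cpu:.1f}")
```

Because this run used 0.997 CPUs, the two figures come out close to each other; for a multi-threaded or mostly idle program they can differ widely.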