The following answer to "Benchmarking - How to count number of instructions sent to CPU to find consumed MIPS" suggests that:
perf stat ./my_program
on Linux will use CPU performance counters to record how many instructions it ran, and how many core clock cycles it took. (And how much CPU time it used, and will calculate MIPS for you).
An example run generates the following output, which does not contain any calculated MIPS information.
Performance counter stats for './hello.py':
1452.607792 task-clock (msec) # 0.997 CPUs utilized
327 context-switches # 0.225 K/sec
147 cpu-migrations # 0.101 K/sec
35,548 page-faults # 0.024 M/sec
2,254,593,107 cycles # 1.552 GHz [26.64%]
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
1,652,281,933 instructions # 0.73 insns per cycle [38.87%]
353,431,039 branches # 243.308 M/sec [37.95%]
18,536,723 branch-misses # 5.24% of all branches [38.06%]
612,338,241 L1-dcache-loads # 421.544 M/sec [25.93%]
41,746,028 L1-dcache-load-misses # 6.82% of all L1-dcache hits [25.71%]
25,531,328 LLC-loads # 17.576 M/sec [26.39%]
1,846,241 LLC-load-misses # 7.23% of all LL-cache hits [26.26%]
1.456531157 seconds time elapsed
[Q] How can I calculate MIPS correctly from the output of perf stat? To calculate MIPS, should I compute instructions/seconds_time_elapsed from the values obtained from perf stat?
It's obviously just instructions / seconds (divided by 1 million to scale for the mega metric prefix).
Using the total elapsed time will give you MIPS for the whole program: the total across all cores, with any time spent sleeping / waiting counted against it.
Task-clock will count total CPU time used on all cores, so it will give you the average MIPS across all cores used, not counting any time spent sleeping. (task-clock:u would count only user-space time, but task-clock counts time spent in the kernel as well.)
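As a rough worked example of both calculations, here is a minimal Python sketch that plugs in the numbers from the perf stat output in the question. The helper name mips and the variable names are made up for illustration; the values are copied by hand from the output rather than parsed from it.

```python
def mips(instructions: int, seconds: float) -> float:
    """Return millions of instructions per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return instructions / seconds / 1e6


# Values copied by hand from the perf stat output in the question.
instructions = 1_652_281_933
elapsed_s = 1.456531157              # "seconds time elapsed" (wall-clock)
task_clock_s = 1452.607792 / 1000    # "task-clock (msec)", converted to seconds

# MIPS for the whole run, with sleeping / waiting counted against it.
wall_mips = mips(instructions, elapsed_s)

# MIPS averaged over the CPU time actually used (user + kernel).
cpu_mips = mips(instructions, task_clock_s)

print(f"MIPS over elapsed time: {wall_mips:.1f}")
print(f"MIPS over task-clock:   {cpu_mips:.1f}")
```

The two results are close here (roughly 1134 vs. 1137) because the run shows 0.997 CPUs utilized, so elapsed time and CPU time are nearly equal. For a multi-threaded program or one that sleeps a lot, they would diverge.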