performance x86 cpu-architecture perf speculative-execution

Measure the number of executed instructions including *speculative*


I'd like to measure the number of instructions executed in my program, including speculative instructions that didn't retire. I know that Linux perf can easily report the retired instruction count with:

$ perf stat --event instructions -- <my_program>

Is there a way to also count the speculatively executed instructions that never retired? I could not find a suitable performance event under perf list.

Is there another proxy to measure the amount of speculative execution in the processor?

More information: I have an Intel Skylake machine, but answers relevant to other Intel/AMD processors would be great as well.


Solution

  • The instructions event counts retired instructions only. Speculative execution works in terms of uops; the CPU only cares about instruction boundaries at decode and at retirement (and, to some extent, in the uop cache).

    An event like uops_executed.thread is probably what you want, compared against uops_retired.retire_slots. Note that the two count in different domains: with micro-fusion, add eax, [rdi] is 1 fused-domain uop (at issue and retire) but 2 unfused-domain uops as counted by uops_executed.thread.

    Per-port events also exist: uops_dispatched_port.port_0, port_1, port_5 and port_6 for the ALU ports, port_2, port_3 and port_7 for load/store-address, and port_4 for store-data. uops_executed.thread should mostly(?) be the sum of those per-port counters.

    To count mis-speculation, you could compare uops_issued.any with uops_retired.retire_slots (both fused-domain). That won't tell you how many of the mis-speculated uops actually executed before the mis-speculation was detected. I don't know a good way to measure that other than careful counting, such as knowing how many unfused-domain uops correspond to the uops counted by uops_retired.retire_slots. That may be doable in a microbenchmark loop; otherwise you'd have to go by averages in a larger program.

    (The event names I mention all exist on my Skylake-client CPU, and probably on earlier and later Intel CPUs. AMD will have very different event names.)
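As a rough illustration of the comparison described above, here is a minimal Python sketch that post-processes counter values. It assumes the counts were collected with something like `perf stat -x, -e uops_issued.any,uops_retired.retire_slots,uops_executed.thread -o counts.csv -- <my_program>`, and that each CSV row has the count in the first field and the event name in the third. The helper names `parse_perf_csv` and `speculation_summary` and the sample numbers are made up for illustration; they are not part of perf.

```python
import csv
import io

# Skylake-client event names from the answer; other CPUs will use different names.
ISSUED = "uops_issued.any"
RETIRED = "uops_retired.retire_slots"
EXECUTED = "uops_executed.thread"


def parse_perf_csv(text):
    """Parse `perf stat -x,` style output into {event_name: count}.

    Assumes field 0 is the count and field 2 is the event name.
    """
    counts = {}
    for row in csv.reader(io.StringIO(text)):
        # Skip blank lines, comment lines, and rows too short to hold an event name.
        if len(row) < 3 or row[0].startswith("#"):
            continue
        value, event = row[0].strip(), row[2].strip()
        # perf prints "<not counted>" or "<not supported>" instead of a number.
        if not value.isdigit():
            continue
        # Drop a modifier suffix such as ":u" so lookups use the bare event name.
        counts[event.split(":")[0]] = int(value)
    return counts


def speculation_summary(counts):
    """Derive rough mis-speculation proxies from the three uop counters."""
    issued = counts[ISSUED]
    retired = counts[RETIRED]
    executed = counts[EXECUTED]
    wasted = issued - retired  # fused-domain uops issued but never retired
    return {
        "wasted_issue_slots": wasted,
        "wasted_fraction": wasted / issued if issued else 0.0,
        # Unfused-domain executed uops per fused-domain retired uop.
        # Micro-fusion alone can push this above 1, so it is only a proxy.
        "executed_per_retired": executed / retired if retired else 0.0,
    }


# Invented sample values, only to show the expected input shape.
SAMPLE = """\
1200000,,uops_issued.any,500000000,100.00,,
1000000,,uops_retired.retire_slots,500000000,100.00,,
1500000,,uops_executed.thread,500000000,100.00,,
"""

if __name__ == "__main__":
    print(speculation_summary(parse_perf_csv(SAMPLE)))
```

The issued-minus-retired difference is in the fused domain, while `executed_per_retired` mixes domains, so neither number says exactly how many mis-speculated uops reached an execution port. They only bound the problem in the way the answer describes.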