Search code examples
c++performance-estimation

c++ steps as efficiency


This is a question about the principle behind estimating efficiency. In one of my projects I came across this situation: a function gets two positive integers and returns the lowest of the two. I would like to know if this method I usually use, where I calculate in amount of steps, is a somewhat accurate method of estimating efficiency and whether there are others or if I should always simply compare how fast they run.

Function(int a, int b)
{
    int lowest = a - b;                   //3 steps, allocating, assigning and calculating
    lowest = lowest * lowest / lowest;    //3 steps, 2 in calculating, 1 in assigning
    //6 steps total

    return lowest;
}

Function(int a, int b)
{
    int lowest;        //1 step in allocating
    if(a > b){         // 2 steps, 1 in comparing, 1 in picking the outcome
        lowest = b;        // 1 step in assigning
                           // Total 4 steps
    }else{
        lowest = a;        // 1 step in assigning
                           // Total 4 steps
    }
    return lowest;
}

In this case I would choose for function 2 because it seems to have fewer steps.


Solution

  • Counting steps is a way to analyse the asymptotic efficiency of an algorithm. This is a measure of how well an algorithm scales up to larger inputs.

    However, to compare how fast two functions are, for a fixed input size, we really need to look at how quickly they actually execute. Counting steps is at best a rough guide here, because:

    1. the steps that are executed aren't the C++ statements, but the machine code instructions they compile to
    2. even if your C++ statements each compiled to the same number of instructions (which they probably don't), instructions don't all take the same number of clock cycles to execute
    3. even if they did all have the same notional latency, these functions will probably be inlined, which means considering them in isolation isn't that useful. You need to know how they affect the optimised code at each call site

    There are lots of rules of thumb about which operations are likely to be slower than others, but the only way to be sure is to measure, in a setting that reproduces your real use case as closely as possible.


    Note

    In this specific code:

    • version 1 doesn't look like it will give the right result, but ignoring that - it has more steps, but they're mostly integer arithmetic, which is heavily optimised and usually fast
    • version 2 has fewer steps, but one of those steps is a branch (if), which has historically been slow.

      Some architectures allow both branches (the if and the else) to execute simultaneously, which might make it fast again. It could also cause a branch-prediction spill with knock-on effects on other code, slowing something else down.