c++ performance loops optimization conceptual

Comparison in loop (optimization)


Let's consider the situation:

bool b = checking_some_condition();

for (int i = 0; i < 1000000; ++i)
{
    if (b)
        do_something(i);
    else
        do_something_else(i);
}

Is it safe to assume that the compiler will optimize the code above into something like this?

if (b)
{
    for (int i = 0; i < 1000000; ++i)
        do_something(i);
}
else
{
    for (int i = 0; i < 1000000; ++i)
        do_something_else(i);
}

Of course, I am only giving an example to present the situation. I know that checking a bool value 1000000 times has hardly any noticeable effect on performance, but if I had more complex comparisons, with multiple paths the code inside the loop could take, the change in performance could be significant, especially if this code were inside a function that is called many times.


Solution

  • As was mentioned in the comments above, you can't safely assume what the compiler will or won't optimize. It is free to apply such transformations or not.

    If you want to get a feeling for what's going on, the best way is to look at the generated assembly, which gives you an objective basis for arguing about what the compiler has done. https://godbolt.org/z/W-5Hve shows the simple example you posted above.

    However, please try to make the example in godbolt as realistic as possible and then check the assembly. Even if two snippets yield the same assembly in godbolt, you still need to check the assembly of the compiled implementation in your codebase to make sure the same happens there.

    To summarize, what I normally do is:
    - try a realistic example in godbolt, play with different compilers and flags, and change the code until I think I know what's going on.
    - compile my project and look at the assembly there, finding the specific function again to make sure that the result in my codebase is the same.
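For the first step, a self-contained pair of functions is convenient to paste into godbolt and compare side by side. The following is a minimal sketch, not the code behind the link above: the bodies of do_something and do_something_else are hypothetical stand-ins (the question does not show them), and the function names branch_inside and branch_hoisted are made up for illustration. The transformation in question is commonly called loop unswitching.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the functions in the question.
// They are cheap and side-effect free so both variants can be compared.
inline std::uint64_t do_something(int i)      { return static_cast<std::uint64_t>(i) * 2u; }
inline std::uint64_t do_something_else(int i) { return static_cast<std::uint64_t>(i) + 7u; }

// Variant 1: the condition is tested on every iteration.
std::uint64_t branch_inside(bool b, int n)
{
    std::uint64_t sum = 0;
    for (int i = 0; i < n; ++i)
    {
        if (b)
            sum += do_something(i);
        else
            sum += do_something_else(i);
    }
    return sum;
}

// Variant 2: the condition is hoisted out of the loop by hand.
std::uint64_t branch_hoisted(bool b, int n)
{
    std::uint64_t sum = 0;
    if (b)
    {
        for (int i = 0; i < n; ++i)
            sum += do_something(i);
    }
    else
    {
        for (int i = 0; i < n; ++i)
            sum += do_something_else(i);
    }
    return sum;
}
```

Both functions return the same value for the same arguments, so any difference in the generated assembly comes from the compiler and flags, not from the logic. With bodies this small the result may look very different from what a real codebase produces, which is why the second step still matters.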

    As a little extra: objdump -M intel -dC executable will show you the disassembly of an executable (-d disassembles, -C demangles C++ names, -M intel selects Intel syntax).
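If the assembly shows that the branch stays inside the loop and that matters for your case, one option is to stop relying on the optimizer and move the decision out of the loop yourself. The sketch below does this with a bool template parameter and assumes C++17 for if constexpr. The names scale, offset, run_loop and run are hypothetical and only serve the illustration.

```cpp
#include <cstdint>

// Hypothetical per-element operations; names are illustrative only.
inline std::uint64_t scale(int i)  { return static_cast<std::uint64_t>(i) * 3u; }
inline std::uint64_t offset(int i) { return static_cast<std::uint64_t>(i) + 1u; }

// The condition is a template parameter, so each instantiation contains
// only one of the two calls; `if constexpr` discards the other branch.
template <bool UseScale>
std::uint64_t run_loop(int n)
{
    std::uint64_t sum = 0;
    for (int i = 0; i < n; ++i)
    {
        if constexpr (UseScale)
            sum += scale(i);
        else
            sum += offset(i);
    }
    return sum;
}

// The runtime check happens once, outside the loop.
std::uint64_t run(bool b, int n)
{
    return b ? run_loop<true>(n) : run_loop<false>(n);
}
```

The loop body is written only once, but the price is one instantiation per combination of conditions, so with several independent flags the code size grows quickly. Whether this is actually faster than the plain loop is again something to check in the assembly or with a measurement.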