Search code examples
c++gccoptimizationlambdallvm-clang

Do C++ compilers perform compile-time optimizations on lambda closures?


Suppose we have the following (nonsensical) code:

const int a = 0;
int c = 0;
for(int b = 0; b < 10000000; b++)
{
    if(a) c++;
    c += 7;
}

Variable 'a' equals zero, so the compiler can deduce on compile time, that the instruction 'if(a) c++;' will never be executed and will optimize it away.

My question: Does the same happen with lambda closures?

Check out another piece of code:

const int a = 0;
function<int()> lambda = [a]()
{
    int c = 0;
    for(int b = 0; b < 10000000; b++)
    {
        if(a) c++;
        c += 7;
    }
    return c;
}

Will the compiler know that 'a' is 0 and will it optimize the lambda?

Even more sophisticated example:

function<int()> generate_lambda(const int a)
{
    return [a]()
    {
        int c = 0;
        for(int b = 0; b < 10000000; b++)
        {
            if(a) c++;
            c += 7;
        }
        return c;
    };
}

function<int()> a_is_zero = generate_lambda(0);
function<int()> a_is_one = generate_lambda(1);

Will the compiler be smart enough to optimize the first lambda when it knows that 'a' is 0 at generation time?

Does gcc or llvm have this kind of optimizations?

I'm asking because I wonder if I should make such optimizations manually when I know that certain assumptions are satisfied on lambda generation time or the compiler will do that for me.


Solution

  • Looking at the assembly generated by gcc5.2 -O2 shows that the optimization does not happen when using std::function:

    #include <functional>
    
    int main()
    {
        const int a = 0;    
        std::function<int()> lambda = [a]()
        {
            int c = 0;
            for(int b = 0; b < 10000000; b++)
            {
                if(a) c++;
                c += 7;
            }
            return c;
        };
    
        return lambda();
    }
    

    compiles to some boilerplate and

        movl    (%rdi), %ecx
        movl    $10000000, %edx
        xorl    %eax, %eax
        .p2align 4,,10
        .p2align 3
    .L3:
        cmpl    $1, %ecx
        sbbl    $-1, %eax
        addl    $7, %eax
        subl    $1, %edx
        jne .L3
        rep; ret
    

    which is the loop you wanted to see optimized away. (Live) But if you actually use a lambda (and not an std::function), the optimization does happen:

    int main()
    {
        const int a = 0;    
        auto lambda = [a]()
        {
            int c = 0;
            for(int b = 0; b < 10000000; b++)
            {
                if(a) c++;
                c += 7;
            }
            return c;
        };
    
        return lambda();
    }
    

    compiles to

    movl    $70000000, %eax
    ret
    

    i.e. the loop was removed completely. (Live)

    Afaik, you can expect a lambda to have zero overhead, but std::function is different and comes with a cost (at least at the current state of the optimizers, although people apparently work on this), even if the code "inside the std::function" would have been optimized. (Take that with a grain of salt and try if in doubt, since this will probably vary between compilers and versions. std::functions overhead can certainly be optimized away.)

    As @MarcGlisse correctly pointed out, clang3.6 performs the desired optimization (equivalent to the second case above) even with std::function. (Live)

    Bonus edit, thanks to @MarkGlisse again: If the function that contains the std::function is not called main, the optimization happening with gcc5.2 is somewhere between gcc+main and clang, i.e. the function gets reduced to return 70000000; plus some extra code. (Live)

    Bonus edit 2, this time mine: If you use -O3, gcc will, (for some reason) as explained in Marco's answer, optimize the std::function to

    cmpl    $1, (%rdi)
    sbbl    %eax, %eax
    andl    $-10000000, %eax
    addl    $80000000, %eax
    ret
    

    and keep the rest as in the not_main case. So I guess at the bottom of the line, one will just have to measure when using std::function.