Tags: c++, loops, compilation, compiler-optimization

Will the compiler know to skip this loop?


I am exploring the speed of C++ compared to other languages such as Python. The following code counts to 100 million and prints the final number. I'm wondering whether the compiler can tell that the variable n is only used for printing, so that it can skip the loop entirely and just print the final value.

#include <cstddef>
#include <iostream>

int main() {
    std::size_t n = 0;
    while (n < 100'000'000) {
        n++;
    }

    std::cout << n << std::endl;

    return 0;
}

I thought that perhaps the answer lies in looking at the assembly code, but I'm not really sure how this works.


Solution

  • Will the compiler know to skip this loop?

    Depends.

    If you compile without optimizations, the compiler translates your code literally and keeps the loop. If you compile with optimizations enabled, the compiler can recognize that the loop only computes a fixed value and skip it entirely.

    I have compiled your code with gcc at Compiler Explorer.


    Without optimizations (-O0):

    main:
            push    rbp
            mov     rbp, rsp
            sub     rsp, 16
            mov     QWORD PTR [rbp-8], 0
            jmp     .L2
    .L3:
            add     QWORD PTR [rbp-8], 1
    .L2:
            cmp     QWORD PTR [rbp-8], 99999999
            jbe     .L3
            mov     rax, QWORD PTR [rbp-8]
            mov     rsi, rax
            mov     edi, OFFSET FLAT:std::cout
            call    std::basic_ostream<char, std::char_traits<char> >::operator<<(unsigned long)
            mov     esi, OFFSET FLAT:std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
            mov     rdi, rax
            call    std::basic_ostream<char, std::char_traits<char> >::operator<<(std::basic_ostream<char, std::char_traits<char> >& (*)(std::basic_ostream<char, std::char_traits<char> >&))
            mov     eax, 0
            leave
            ret
    

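    The instructions from jmp .L2 through jbe .L3 are the loop you programmed: .L3 increments n in its stack slot, and .L2 compares it against 99999999 and jumps back to .L3 as long as n is below or equal to that value.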


    With optimizations (-Os):

    main:
            push    rax
            mov     esi, 100000000
            mov     edi, OFFSET FLAT:std::cout
            call    std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<unsigned long>(unsigned long)
            mov     rdi, rax
            call    std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)
            xor     eax, eax
            pop     rdx
            ret
    

    The mov esi, 100000000 loads the final value of n directly as the argument for the output call. No loop is executed at all.


    I chose the optimization level -Os here because it should generate the smallest executable, which means less assembly to look at (note that less assembly doesn't necessarily mean faster execution). However, the compiler already removes the loop at the lowest optimization level (-O1).

    For an overview of the gcc optimization levels, you can take a look at this answer.