c++ inline compiler-optimization

Why does the compiler not perform inlining, if the function contains static variables?


I read on the following website that the compiler may not perform inlining when a function has static variables. What is the reason?

Reference: Inline Functions in C++

Remember, inlining is only a request to the compiler, not a command. Compiler can ignore the request for inlining. Compiler may not perform inlining in such circumstances like:

  1. If a function contains a loop. (for, while, do-while)
  2. If a function contains static variables.
  3. If a function is recursive.
  4. If a function return type is other than void, and the return statement doesn’t exist in function body.
  5. If a function contains switch or goto statement.

Solution

  • Compilers can inline functions with loops, static variables, switch statements, and even recursive functions.

    Here's an example:

    #include <iostream>
    
    inline int foo(int* a, int n)
    {
        int r = 0;
        static int b;
        for (int i = 0; i < n; i++)
        {
            r += a[i];
        }
        switch (n)
        {
        case 42:
            std::cout << "???\n";
        }
        return r;
    }
    
    inline int foo2(int n)
    {
        return n == 0 ? 0 : 1 + foo2(n - 1);
    }
    
    int main()
    {
        int bar[3];
        for (int i = 0; i < 3; i++)
        {
            std::cin >> bar[i];
        }
        std::cout << foo(bar, 3) << '\n';
        std::cout << foo2(bar[0]) << '\n';
    }
    

    And here's the assembly code which the compiler generated:

    main:
            sub     rsp, 24
            mov     edi, OFFSET FLAT:_ZSt3cin
            lea     rsi, [rsp+4]
            call    std::basic_istream<char, std::char_traits<char> >::operator>>(int&)
            lea     rsi, [rsp+8]
            mov     edi, OFFSET FLAT:_ZSt3cin
            call    std::basic_istream<char, std::char_traits<char> >::operator>>(int&)
            lea     rsi, [rsp+12]
            mov     edi, OFFSET FLAT:_ZSt3cin
            call    std::basic_istream<char, std::char_traits<char> >::operator>>(int&)
            mov     esi, DWORD PTR [rsp+8]
            mov     edi, OFFSET FLAT:_ZSt4cout
            add     esi, DWORD PTR [rsp+4]
            add     esi, DWORD PTR [rsp+12]
            call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
            mov     edx, 1
            lea     rsi, [rsp+3]
            mov     BYTE PTR [rsp+3], 10
            mov     rdi, rax
            call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
            mov     esi, DWORD PTR [rsp+4]
            mov     edi, OFFSET FLAT:_ZSt4cout
            call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
            lea     rsi, [rsp+3]
            mov     edx, 1
            mov     BYTE PTR [rsp+3], 10
            mov     rdi, rax
            call    std::basic_ostream<char, std::char_traits<char> >& std::__ostream_insert<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*, long)
            xor     eax, eax
            add     rsp, 24
            ret
    _GLOBAL__sub_I_main:
            sub     rsp, 8
            mov     edi, OFFSET FLAT:_ZStL8__ioinit
            call    std::ios_base::Init::Init() [complete object constructor]
            mov     edx, OFFSET FLAT:__dso_handle
            mov     esi, OFFSET FLAT:_ZStL8__ioinit
            mov     edi, OFFSET FLAT:_ZNSt8ios_base4InitD1Ev
            add     rsp, 8
            jmp     __cxa_atexit
    

    Notice that the assembly for main contains no call to foo or foo2. The array elements are summed by the instructions mov esi, DWORD PTR [rsp+8], add esi, DWORD PTR [rsp+4], and add esi, DWORD PTR [rsp+12] in the middle of main, and the recursive call foo2(bar[0]) has been reduced to a single load, mov esi, DWORD PTR [rsp+4]. The article is either wrong, or it used "may not" to mean "might not" rather than "isn't allowed to". The latter reading would make sense, because a compiler is less likely to inline larger and more complex functions.

    Also, as explained in the other answers, compilers can inline functions without the inline keyword. If you remove the inline keyword from the example above, the compiler will still inline both functions when optimizations are enabled.
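
    In the example above the static variable b is never used, so here is a minimal sketch in which the static actually carries state. The names next_id, first_caller and second_caller are hypothetical and are not part of the original example. The point is that a function-local static is a single object with static storage duration, so even if the body is inlined at several call sites, every copy reads and writes the same variable. That is why a static variable does not, by itself, prevent inlining.

```cpp
// Hypothetical helper (not from the original answer): hands out increasing IDs.
// 'counter' is one object with static storage duration. It lives outside any
// stack frame, so an inlined copy of this body still refers to the same object.
inline int next_id()
{
    static int counter = 0;
    return ++counter;
}

// Two separate call sites. Whether or not the compiler inlines next_id()
// into them, both must observe the one shared counter.
inline int first_caller()  { return next_id(); }
inline int second_caller() { return next_id(); }
```

    Calling first_caller(), then second_caller(), then next_id() directly yields 1, 2 and 3 regardless of the compiler's inlining decisions, because inlining must not change observable behavior.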