Tags: c++, c++11, overhead, pass-by-rvalue-reference

Do rvalue references have the same overhead as lvalue references?


Consider this example:

#include <utility>

// runtime dominated by argument passing
template <class T>
void foo(T t) {}

int main() {
    int i(0);
    foo<int>(i); // fast -- int is scalar type
    foo<int&>(i); // slow -- lvalue reference overhead
    foo<int&&>(std::move(i)); // ???
}

Is foo<int&&>(std::move(i)) as fast as foo<int>(i), or does it involve pointer overhead like foo<int&>(i)?

EDIT: As suggested, running g++ -S gave me the same 51-line assembly file for foo<int>(i) and foo<int&>(i), but foo<int&&>(std::move(i)) resulted in 71 lines of assembly code (it looks like the difference came from std::move).

EDIT: Thanks to those who recommended g++ -S with different optimization levels -- using -O3 (and making foo noinline) I was able to get output which looks like xaxxon's solution.


Solution

  • In your specific situation, they are most likely all the same. With gcc -O3 on godbolt (https://godbolt.org/g/XQJ3Z4), the resulting code for:

    #include <utility>
    
    // runtime dominated by argument passing
    template <class T>
    int foo(T t) { return t;}
    
    int main() {
        int i{0};
        volatile int j;
        j = foo<int>(i); // fast -- int is scalar type
        j = foo<int&>(i); // slow -- lvalue reference overhead
        j = foo<int&&>(std::move(i)); // ???
    }
    

    is:

        mov     dword ptr [rsp - 4], 0 // foo<int>(i);
        mov     dword ptr [rsp - 4], 0 // foo<int&>(i);
        mov     dword ptr [rsp - 4], 0 // foo<int&&>(std::move(i)); 
        xor     eax, eax
        ret
    

    The volatile int j is there so that the compiler cannot optimize away all the code: without it, the compiler would know that the results of the calls are discarded, and the whole program would optimize to nothing.

    HOWEVER, if you force the function not to be inlined, things change a bit. With int __attribute__ ((noinline)) foo(T t) { return t;}, the output is:

    int foo<int>(int):                           # @int foo<int>(int)
            mov     eax, edi
            ret
    int foo<int&>(int&):                          # @int foo<int&>(int&)
            mov     eax, dword ptr [rdi]
            ret
    int foo<int&&>(int&&):                          # @int foo<int&&>(int&&)
            mov     eax, dword ptr [rdi]
            ret
    

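    Code for the above: https://godbolt.org/g/pbZ1BT

    Here foo<int> receives the value itself in a register (edi), while foo<int&> and foo<int&&> both receive an address (rdi) and load the value through it. So when the call is not inlined, an rvalue reference costs the same as an lvalue reference.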
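
    As a rough sketch of why the two reference versions produce identical assembly: a reference parameter, whether lvalue or rvalue, is bound to the caller's object, and std::move only performs a cast rather than a copy. The helper names address_via_lvalue_ref and address_via_rvalue_ref are made up for this illustration and do not appear in the original code.

    ```cpp
    #include <cassert>
    #include <type_traits>
    #include <utility>

    // Hypothetical helpers (not from the original post): each returns the
    // address of the object its reference parameter is bound to.
    const int* address_via_lvalue_ref(int& r) { return &r; }
    const int* address_via_rvalue_ref(int&& r) { return &r; }

    // std::move does not copy anything; applied to an int lvalue it only
    // produces an expression of type int&&.
    static_assert(
        std::is_same<decltype(std::move(std::declval<int&>())), int&&>::value,
        "std::move on an int lvalue yields int&&");

    int main() {
        int i = 0;
        // Both kinds of reference refer to the same object i in the caller,
        // which is why the non-inlined functions both load through an address.
        assert(address_via_lvalue_ref(i) == &i);
        assert(address_via_rvalue_ref(std::move(i)) == &i);
        return 0;
    }
    ```

    This only demonstrates the language-level behaviour. Whether an address is actually passed at run time still depends on inlining and optimization level, as the two godbolt outputs show.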

    For questions like these, learn to love https://godbolt.org and https://quick-bench.com/ (Quick Bench requires you to learn how to properly use Google Benchmark).