Tags: c++, c++11, rvalue-reference

Compiler deduction of rvalue-references for variables going out of scope


Why won't the compiler automatically deduce that a variable is about to go out of scope, and therefore let it be considered an rvalue-reference?

Take for example this code:

#include <string>

int foo(std::string && bob);
int foo(const std::string & bob);

int main()
{
    std::string bob("  ");
    return foo(bob);
}

Inspecting the generated assembly clearly shows that the const & overload of foo is called at the end of the function.

Compiler Explorer link here: https://godbolt.org/g/mVi9y6

Edit: To clarify, I'm not looking for suggestions for alternative ways to move the variable. Nor am I trying to understand why the compiler chooses the const& version of foo. Those are things that I understand fine.

I'm interested in a counterexample where the compiler converting the last usage of a variable before it goes out of scope into an rvalue-reference would introduce a serious bug into the resulting code. I'm unable to think of code that breaks if a compiler implements this "optimization".

If there's no code that breaks when the compiler automatically makes the last usage of a variable about to go out of scope an rvalue-reference, then why wouldn't compilers implement that as an optimization?

My assumption is that there is some code that would break were compilers to implement that "optimization", and I'd like to know what that code looks like.

The code that I detail above is an example of code that I believe would benefit from an optimization like this.

The order of evaluation of function arguments, and of the operands of most operators, such as in operator+(foo(bob), foo(bob)), is unspecified. As such, code such as

return foo(bob) + foo(std::move(bob));

is dangerous, because the compiler that you're using may evaluate the right-hand side of the + operator first. The string bob would then potentially be moved from, leaving it in a valid but unspecified state, and foo(bob) would subsequently be called with the resulting, modified string.

On another implementation, the non-move version might be evaluated first, and the code would behave the way a non-expert would expect.

If we assume that some future version of the C++ standard adds an optimization that allows the compiler to treat the last usage of a variable as an rvalue reference, then

return foo(bob) + foo(bob);

would work with no surprises (assuming appropriate implementations of foo, anyway).

Such a compiler, no matter what order of evaluation it uses for function arguments, would always treat the second (and thus last) usage of bob in this context as an rvalue-reference, whether that was the left-hand or the right-hand side of operator+.


Solution

  • Here's a piece of perfectly valid existing code that would be broken by your change:

    // launch a thread that does the calculation, moving v to the thread, and
    // returns a future for the result
    std::future<Foo> run_some_async_calculation_on_vector(std::pmr::vector<int> v); 
    
    std::future<Foo> run_some_async_calculation() {
        char buffer[2000];
        std::pmr::monotonic_buffer_resource rsrc(buffer, 2000);
        std::pmr::vector<int> vec(&rsrc);
        // fill vec
        return run_some_async_calculation_on_vector(vec);
    }
    

    Move constructing a container always propagates its allocator, but copy constructing one doesn't have to, and polymorphic_allocator is an allocator that doesn't propagate on container copy construction. Instead, it always reverts to the default memory resource.

    This code is safe with copying, because run_some_async_calculation_on_vector receives a copy allocated from the default memory resource (which hopefully persists throughout the thread's lifetime). It is completely broken by a move: the moved-to vector would keep rsrc as its memory resource, and rsrc disappears as soon as run_some_async_calculation returns.