
Should I return an rvalue reference parameter by rvalue reference?


I have a function which modifies std::string& lvalue references in-place, returning a reference to the input parameter:

std::string& transform(std::string& input)
{
    // transform the input string
    ...

    return input;
}

I have a helper function which allows the same inline transformations to be performed on rvalue references:

std::string&& transform(std::string&& input)
{
    return std::move(transform(input)); // calls the lvalue reference version
}

Notice that it returns an rvalue reference.

I have read several questions on SO relating to returning rvalue references, and have come to the conclusion that this is bad practice.

From what I have read, it seems the consensus is that since return values are rvalues, plus taking into account the RVO, just returning by value would be as efficient:

std::string transform(std::string&& input)
{
    return transform(input); // calls the lvalue reference version
}

However, I have also read that returning function parameters prevents the RVO optimisation.

This leads me to believe a copy would happen from the std::string& return value of the lvalue reference version of transform(...) into the std::string return value.

Is that correct?

Is it better to keep my std::string&& transform(...) version?


Solution

  • There's no right answer, but returning by value is safer.

    > I have read several questions on SO relating to returning rvalue references, and have come to the conclusion that this is bad practice.

    Returning a reference to a parameter foists a contract upon the caller that either

    1. The parameter cannot be a temporary (which is just what rvalue references represent), or
    2. The return value won't be retained past the next semicolon in the caller's context (when temporaries get destroyed).

    If the caller passes a temporary and tries to save the result, they get a dangling reference.
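To make that contract concrete, here is a minimal sketch. The upper-casing body of transform and the helper safe_use are assumptions for illustration only, since the question elides the real transformation.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <utility>

// Hypothetical body: upper-cases the string in place.
std::string& transform(std::string& input) {
    for (char& c : input)
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    return input;
}

// The question's helper: returns a reference to its own parameter.
std::string&& transform(std::string&& input) {
    return std::move(transform(input)); // input is an lvalue here
}

// Safe: the result is moved into `kept` before the temporary is destroyed
// at the end of the full-expression.
std::string safe_use() {
    std::string kept = transform(std::string("abc"));
    return kept;
}

// Unsafe, so left commented out: the temporary dies at the semicolon and
// `r` would be a dangling reference.
//   const std::string& r = transform(std::string("abc"));
```

Binding the result to a reference does not extend the temporary's lifetime, because the reference is bound to a function's return value rather than directly to the temporary.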

    > From what I have read, it seems the consensus is that since return values are rvalues, plus taking into account the RVO, just returning by value would be as efficient:

    Returning by value adds a move-construction operation. Its cost is usually proportional to the size of the object itself (its sizeof), not to the length of the string it holds. Whereas returning by reference only requires the machine to ensure that one address is in a register, returning by value typically requires zeroing a couple of pointers in the parameter std::string and putting their values in a new std::string to be returned.

    It's cheap, but nonzero.
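The difference can be sketched as follows. The names transform_ref and transform_val, the appended '!' and the two checking helpers are all hypothetical, introduced only to put both variants side by side.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Stand-in for the in-place version (the real transformation is elided
// in the question; appending '!' is an assumption).
std::string& transform(std::string& input) {
    input += '!';
    return input;
}

// Returns a reference: no new std::string is created.
std::string&& transform_ref(std::string&& input) {
    return std::move(transform(input));
}

// Returns by value: a new std::string is move-constructed from the parameter.
std::string transform_val(std::string&& input) {
    return std::move(transform(input));
}

// True if the reference-returning variant hands back the caller's own object.
bool ref_returns_same_object() {
    std::string s = "abc";
    std::string&& r = transform_ref(std::move(s));
    return &r == &s;
}

// The by-value variant yields a separate object holding the transformed text.
std::string val_returns_new_object() {
    std::string s = "abc";
    return transform_val(std::move(s));
}
```

The by-value variant pays for one extra move construction and in exchange gives the caller an object that outlives the argument.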

    The direction currently taken by the standard library is, somewhat surprisingly, to be fast and unsafe and return the reference. (The only function I know of that actually does this is std::get from <tuple>.) As it happens, I've presented a proposal to the C++ core language committee toward resolving this issue; a revision is in the works, and just today I started investigating an implementation. But it's complicated, and not a sure thing.

    std::string transform(std::string&& input)
    {
        return transform(input); // calls the lvalue reference version
    }
    

    The compiler won't generate a move here: transform(input) is an lvalue of type std::string&, so the returned std::string is copy-constructed from it. If input weren't a reference at all and you wrote return input; it would move, but it has no reason to believe that transform will return input just because it was a parameter, and it won't deduce ownership from the rvalue reference type anyway. (See C++14 §12.8/31-32.)

    You need to do:

    return std::move( transform( input ) );
    

    or equivalently

    transform( input );
    return std::move( input );
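One way to check the copy-versus-move claim is with a small counting type. This is only a sketch: Tracker, by_value_copy, by_value_move and copies_made are hypothetical names, with Tracker standing in for std::string.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for std::string that counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// In-place version: returns a reference to its argument.
Tracker& transform(Tracker& input) { return input; }

// As in the question: the returned expression is an lvalue, so it is copied.
Tracker by_value_copy(Tracker&& input) { return transform(input); }

// With std::move: the return value is move-constructed instead.
Tracker by_value_move(Tracker&& input) { return std::move(transform(input)); }

// Runs one variant on a fresh object and reports how many copies it made.
int copies_made(Tracker (*f)(Tracker&&)) {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker source;
    Tracker result = f(std::move(source));
    (void)result;
    return Tracker::copies;
}
```

Calling copies_made(by_value_copy) should report one copy, while copies_made(by_value_move) should report none, since the std::move version only ever move-constructs.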