As the title says, I was writing a static library with a class template and several non-template operator overloads. The class template is defined in a.h, and the functions were defined in a.cc.
So I decided to go on and ask: can RVO/NRVO benefit the user code of the library?
Edit: I'm sorry about that; it was another question that I had just asked, and it shouldn't have been put into this question. To make the scenario clearer, I was actually trying to encapsulate types like uint8_t and so on, and planned to write some large integer type myself.
When a function returns a prvalue, that means (pre-C++17) that the return value is a temporary object. So, two things happen to this temporary object:
By the standard, if you do `return Type(...);`, the return expression is evaluated, resulting in a temporary, and that temporary is used to copy-initialize the return value object.
However, the standard says that this copy-initialization doesn't have to happen if `Type` is the same type as the return value object. In that case, the compiler will simply apply the initialization of the temporary to the return value object directly.
By the standard, if you do `Type var = some_func(...)`, and `some_func` returns a value, then the temporary returned by `some_func` will be used to copy-initialize `var`.
However, the standard says that the copy-initialization doesn't have to happen if `Type` is the type of value that `some_func` returns. In that case, the return value object initialized by `some_func` is `var` itself. There is no temporary object that initializes `var`; `some_func` initializes `var` directly.
Both of these processes are wholly independent of each other.
The elision of the initialization of the return value is based on the implementation of the function. That implementation does not care what the caller is doing. It simply initializes the return value object directly rather than doing a copy from the return expression.
The elision of the initialization of a variable from the return value of a function does not care about the implementation of the function. It is based solely on the fact that the function returns a prvalue, and that the type of that prvalue is the same as the type of the object being initialized by it. It only needs to see the function's declaration to do its part of eliding the operation.
When both of these happen, you have complete elision of all copies from within the function to its eventual destination. But neither requires the other to exist.
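A rough way to observe the "both happen" case is to have the object remember the address it was constructed at and compare that with the address of the caller's variable. This is only a sketch: the `born_at` member and the `constructed_in_place` helper are made up for illustration, and the result is only guaranteed from C++17 onward (earlier standards merely permit it, though mainstream compilers do it by default).

```cpp
// Hypothetical type that remembers where its constructor ran.
struct Type {
    const void* born_at;  // address of the object the constructor was run on

    explicit Type(int) : born_at(this) {}

    // User-provided copy/move constructors: they carry the ORIGINAL address
    // along, so a copy or move would leave born_at pointing somewhere else
    // than the new object. They also make the type non-trivially copyable.
    Type(const Type& other) : born_at(other.born_at) {}
    Type(Type&& other) noexcept : born_at(other.born_at) {}
};

// Function side: the return expression is a prvalue of the return type.
Type some_func() {
    return Type(42);
}

// Caller side: the variable is initialized from a prvalue of its own type.
// Returns true when the constructor ran directly on `var`, i.e. when no
// copy or move happened anywhere between some_func and var.
bool constructed_in_place() {
    Type var = some_func();
    return var.born_at == &var;
}
```

If either elision were skipped, a move constructor would run and `born_at` would point at the moved-from temporary instead of at `var`.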
So, consider the following:
Type foo()
{
    Type t;
    return t;
}

Type t2 = foo();
By the standard, this is two copy-initializations. First, the return value of `foo` is initialized by moving from `t`. Second, `t2` is initialized by moving from the return value of `foo`.
If the compiler can elide both of these, then you get 0 moves. If the compiler can elide the initialization of `t2` but not perform NRVO on `t`, then you get 1 move. If it can't do either, then you get 2 moves (and you should immediately stop using that compiler ;) ).
If you want more of the implementation details, then it has to do with function calling conventions and the ABI.
The storage for a function's arguments and return value is allocated by the caller. The caller sees that the function will return a prvalue, so it allocates enough storage of the appropriate alignment to store that value, then calls the function with a pointer to that storage. The function's implementation will use that storage when it is initializing the return value.
Elision, on the side of the function implementation, is merely constructing the object directly in the return value memory. Elision, on the side of the function using the return value, is simply passing in the storage for the object it will be used with. The `t2` example above will perform elision by passing the storage for `t2` as the return value storage given to `foo`.
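To make the calling-convention picture concrete, here is a hand-written imitation of what a compiler might do under the hood. It is a sketch, not what any particular ABI literally emits: `foo_lowered`, `return_slot`, and `caller_demo` are invented names, and real compilers pass the hidden pointer in an ABI-specific way.

```cpp
#include <new>  // placement new

struct Type {
    int value;
    explicit Type(int v) : value(v) {}
    ~Type() {}  // user-provided destructor, so the type is not trivial
};

// Hand-written stand-in for what might be generated for
//     Type foo() { return Type(7); }
// The caller supplies the address of the return value storage, and the
// function constructs the return value directly in it.
Type* foo_lowered(void* return_slot) {
    return ::new (return_slot) Type(7);
}

// Hand-written stand-in for the caller side of
//     Type t2 = foo();
// The storage that will hold t2 is what gets passed as the return slot.
// Returns true if the object ended up in the caller's storage with value 7.
bool caller_demo() {
    alignas(Type) unsigned char storage_for_t2[sizeof(Type)];
    Type* t2 = foo_lowered(storage_for_t2);

    bool in_place = static_cast<void*>(t2) == static_cast<void*>(storage_for_t2);
    int value = t2->value;

    t2->~Type();  // objects created with placement new are destroyed manually
    return in_place && value == 7;
}
```

Note that `foo_lowered` has no idea whether `return_slot` belongs to a named variable or to a temporary, which is exactly why the two sides of the elision can be decided independently.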
The compiler, when compiling `foo`, does not need to know or care if the return value storage is a named value or a temporary. All it knows is that it has been given storage that the return value will be constructed in.
And the compiler, when compiling the caller of `foo`, only needs the function signature, since that tells it everything it needs to know to be able to perform this kind of elision.
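Applied to the library layout from the question, that would look roughly like the sketch below. Everything is squeezed into one listing with comments marking which file each part would live in. `BigInt`, its `low` member, and `user_code` are hypothetical stand-ins, since the question does not show the real types.

```cpp
// ---- a.h: all the user code ever sees ----
struct BigInt {  // hypothetical stand-in for the library's integer type
    static int moves;        // counts move constructions, for illustration
    unsigned long long low;

    explicit BigInt(unsigned long long v) : low(v) {}
    BigInt(const BigInt& o) : low(o.low) {}
    BigInt(BigInt&& o) noexcept : low(o.low) { ++moves; }
};

// Declaration only: the definition lives in the static library.
BigInt operator+(const BigInt& a, const BigInt& b);

// ---- user code: compiled against a.h alone ----
unsigned long long user_code() {
    BigInt x(2), y(3);
    BigInt sum = x + y;  // caller-side elision needs only the declaration
    return sum.low;
}

// ---- a.cc: compiled separately into the library ----
int BigInt::moves = 0;

BigInt operator+(const BigInt& a, const BigInt& b) {
    return BigInt(a.low + b.low);  // function-side elision of a prvalue
}
```

Since C++17 no move is performed for `sum` here, even though the caller never sees the body of `operator+`. Before C++17 that is permitted rather than required, but it is what compilers normally do.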