Search code examples
c++c++11initializationcopy-elision

initializing a non-copyable member (or other object) in-place from a factory function


A class must have a valid copy or move constructor for any of this syntax to be legal:

C x = factory();
C y( factory() );
C z{ factory() };

In C++03 it was fairly common to rely on copy elision to prevent the compiler from touching the copy constructor. Every class has a valid copy constructor signature regardless of whether a definition exists.

In C++11 a non-copyable type should define C( C const & ) = delete;, rendering any reference to the function invalid regardless of use (same for non-moveable). (C++11 §8.4.3/2). GCC, for one, will complain when trying to return such an object by value. Copy elision ceases to help.

Fortunately, we also have new syntax to express intent instead of relying on a loophole. The factory function can return a braced-init-list to construct the result temporary in-place:

C factory() {
    return { arg1, 2, "arg3" }; // calls C::C( whatever ), no copy
}

Edit: If there's any doubt, this return statement is parsed as follows:

  1. 6.6.3/2: "A return statement with a braced-init-list initializes the object or reference to be returned from the function by copy-list-initialization (8.5.4) from the specified initializer list."
  2. 8.5.4/1: "list-initialization in a copy-initialization context is called copy-list-initialization." ¶3: "if T is a class type, constructors are considered. The applicable constructors are enumerated and the best one is chosen through overload resolution (13.3, 13.3.1.7)."

Do not be misled by the name copy-list-initialization. 8.5:

13: The form of initialization (using parentheses or =) is generally insignificant, but does matter when the initializer or the entity being initialized has a class type; see below. If the entity being initialized does not have class type, the expression-list in a parenthesized initializer shall be a single expression.

14: The initialization that occurs in the form T x = a; as well as in argument passing, function return, throwing an exception (15.1), handling an exception (15.3), and aggregate member initialization (8.5.1) is called copy-initialization.

Both copy-initialization and its alternative, direct-initialization, always defer to list-initialization when the initializer is a braced-init-list. There is no semantic effect in adding the =, which is one reason list-initialization is informally called uniform initialization.

There are differences: direct-initialization may invoke an explicit constructor, unlike copy-initialization. Copy-initialization initializes a temporary and copies it to initialize the object, when converting.

The specification of copy-list-initialization for return { list } statements merely specifies the exact equivalent syntax to be temp T = { list };, where = denotes copy-initialization. It does not immediately imply that a copy constructor is invoked.

-- End edit.


The function result can then be received into an rvalue reference to prevent copying the temporary to a local:

C && x = factory(); // applies to other initialization syntax

The question is, how to initialize a nonstatic member from a factory function returning non-copyable, non-moveable type? The reference trick doesn't work because a reference member doesn't extend the lifetime of a temporary.

Note, I'm not considering aggregate-initialization. This is about defining a constructor.


Solution

  • On your main question:

    The question is, how to initialize a nonstatic member from a factory function returning non-copyable, non-moveable type?

    You don't.

    Your problem is that you are trying to conflate two things: how the return value is generated and how the return value is used at the call site. These two things don't connect to each other. Remember: the definition of a function cannot affect how it is used (in terms of language), since that definition is not necessarily available to the compiler. Therefore, C++ does not allow the way the return value was generated to affect anything (outside of elision, which is an optimization, not a language requirement).

    To put it another way, this:

    C c = {...};
    

    Is different from this:

    C c = [&]() -> C {return {...};}()
    

    You have a function which returns a type by value. It is returning a prvalue expression of type C. If you want to store this value, thus giving it a name, you have exactly two options:

    1. Store it as a const& or &&. This will extend the lifetime of the temporary to the lifetime of the control block. You can't do that with member variables; it can only be done with automatic variables in functions.

    2. Copy/move it into a value. You can do this with a member variable, but it obviously requires the type to be copyable or moveable.

    These are the only options C++ makes available to you if you want to store a prvalue expression. So you can either make the type moveable or return a freshly allocated pointer to memory and store that instead of a value.

    This limitation is a big part of the reason why moving was created in the first place: to be able to pass things by value and avoid expensive copies. The language couldn't be changed to force elision of return values. So instead, they reduced the cost in many cases.