c++ initialization lifetime c++23 placement-new

Can someone explain the rules of object lifetimes and uninitialized memory in C++ (context: std::inplace_vector)?


I've been trying to implement my own version of std::inplace_vector in C++26; that is, a dynamically-resizable array with compile-time fixed capacity, that also allows non-default-constructible elements to be stored.

At the outset, the only way I could think of to store non-default-constructible elements was to keep a union containing a T array[N]; member as a field. This is because I previously read that a single-member union can be used to prevent the immediate initialization of that member, which is needed here to prevent the automatic default construction of the T values. Then, I would use placement-new/delete (or, well, actually std::allocator_traits<Allocator>) to directly initialize and destroy elements when needed.

However, I just thought of a possible issue. I think I remember hearing that in C++, one is not allowed to assign an object with size N bytes to any uninitialized sequence of N raw bytes, because the object's "lifetime" has to have begun. In that case, if I were to default-construct a my_inplace_vector, wouldn't I be disallowed from actually assigning directly to any of the elements (i.e. my_inplace_vector[1] = some initializer), because wrapping the field array in a union prevents the lifetime of its elements from starting?

I've read the cppreference page on lifetimes, but I cannot find the relevant section for objects contained within an array contained within a union, so I am unsure whether the lifetimes work out in this case. Can someone explain the rules to me?


Solution

  • Suppose we have something like

    union Storage {
        T arr[N];
        Storage() {}
    };
    

    and your inplace_vector holds a Storage storage; member that is initially default-initialized.

    Then indeed the lifetime of the array arr and of its elements has not been started, and generally that disallows most uses of the elements of the array until their lifetime has been started.

    Now, if T is a type with a trivial assignment, then there is an exception specific to unions. In that case writing

    storage.arr[i] = /*...*/;
    

    where = uses the trivial assignment operator or a built-in assignment operator, will implicitly cause the lifetime of the array and of the ith array element to begin before it is assigned to. That is a special rule that applies only to specific expression forms: the left-hand side of = must directly name the union object, followed by a combination of . member access and built-in array indexing. See [class.union.general]/5 for the exact rules.

    So, if your type T is trivial, then there is no problem. You can indeed implement it as you suggested.
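
    A minimal sketch of this trivial case, assuming T is int and a capacity of 4 (both chosen only for illustration; IntStorage and demo_trivial_union_write are hypothetical names, not part of any library):

```cpp
#include <cassert>

// Union wrapper: the empty constructor leaves `arr` inactive and uninitialized.
union IntStorage {
    int arr[4];
    IntStorage() {}
};

// Writes one element through the special union assignment form and reads it back.
inline int demo_trivial_union_write() {
    IntStorage storage;   // no array element is within its lifetime yet
    storage.arr[2] = 42;  // built-in assignment: begins the lifetime of arr and arr[2]
    return storage.arr[2];
}
```

    Only the element that was assigned is initialized afterwards; reading any other element would still read an indeterminate value.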


    If, however, T does not have a trivial assignment in the expression, then

    storage.arr[i] = /*...*/;
    

    will indeed have undefined behavior because the lifetime of storage.arr[i] has not started when you try to call the operator= member function on it.

    Instead you need to explicitly start the lifetime of the object. You can do this with a placement-new, or conveniently with std::construct_at (which is a wrapper around a specific placement-new form with the additional benefit of being usable at compile-time).

    So you would write e.g.

    std::construct_at(&storage.arr[i]);
    

    to construct the ith element before assigning to it (or you would provide constructor arguments as additional arguments to std::construct_at).

    Unfortunately this doesn't actually work, because it will create a new object of type T and start that object's lifetime. You want that new object to replace the old T object at the ith index (whose lifetime was never started) and have it become the ith element of the array object.

    But a requirement for a newly-created object to become an element of an array object is that the array object itself is within its lifetime. See [intro.object]/2.1.

    The lifetime of the array object hasn't been started though. You can't start the lifetime of the array object with a placement-new either, because that would construct all elements, which you don't want for inplace_vector.

    I think it is always safe to start the lifetime of the array object with std::start_lifetime_as though (assuming T is not const?):

    std::start_lifetime_as<T[N]>(&storage.arr);
    

    You can do this in the default constructor of Storage and it has the benefit of not actually accessing any of the storage.

    Now you can use std::construct_at to individually construct the elements in the array as suggested before.


    Because you need to placement-new each individual element anyway however, there is a simpler approach: You can simply use an array declared as

    alignas(T) std::byte storage[N*sizeof(T)];
    

    to store your objects. When the lifetime of an array of std::byte begins, it implicitly creates objects of implicit-lifetime type nested within it and starts their lifetime, in particular for a T[N] array object itself, but not its elements (if they do not also have implicit-lifetime types). (Note that this applies only to arrays of type std::byte or unsigned char and that of course the alignment and size need to be correctly specified as above.)

    You can then create objects with std::construct_at(reinterpret_cast<T*>(storage + i*sizeof(T))) and retrieve pointers to them with std::launder(reinterpret_cast<T*>(storage + i*sizeof(T))).
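
    The byte-array approach can be sketched roughly as follows. This is only an illustration under stated assumptions, not the reference implementation: tiny_inplace_vector and demo_inplace are hypothetical names, copying is disabled to keep it short, and placement-new is used in place of std::construct_at (which wraps it) so that the sketch also compiles as C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>
#include <utility>

// Hypothetical fixed-capacity container backed by a raw byte array.
template <class T, std::size_t N>
class tiny_inplace_vector {
    static_assert(N > 0, "sketch assumes a non-zero capacity");
    alignas(T) std::byte storage_[N * sizeof(T)];  // no T is constructed here
    std::size_t size_ = 0;

    // Address of slot i; there is not necessarily a live T at this address.
    T* slot(std::size_t i) noexcept {
        return reinterpret_cast<T*>(storage_ + i * sizeof(T));
    }

public:
    tiny_inplace_vector() = default;
    tiny_inplace_vector(const tiny_inplace_vector&) = delete;
    tiny_inplace_vector& operator=(const tiny_inplace_vector&) = delete;
    ~tiny_inplace_vector() { while (size_ > 0) pop_back(); }

    // Explicitly starts the lifetime of a new element in the next free slot.
    template <class... Args>
    T& emplace_back(Args&&... args) {
        assert(size_ < N);
        T* p = ::new (static_cast<void*>(slot(size_))) T(std::forward<Args>(args)...);
        ++size_;
        return *p;
    }

    // Ends the lifetime of the last element.
    void pop_back() {
        assert(size_ > 0);
        --size_;
        std::destroy_at(std::launder(slot(size_)));
    }

    // std::launder yields a pointer to the T object living in the slot.
    T& operator[](std::size_t i) {
        assert(i < size_);
        return *std::launder(slot(i));
    }

    std::size_t size() const noexcept { return size_; }
};

// Usage example: std::string has a non-trivial assignment operator.
inline std::string demo_inplace() {
    tiny_inplace_vector<std::string, 3> v;
    v.emplace_back("a");
    v.emplace_back(3, 'x');  // constructs "xxx"
    v[0] = "hello";          // fine: element 0 is already within its lifetime
    return v[0] + v[1];
}
```

    Assignment through operator[] is only offered for indices below size(), so operator= is never called on a slot whose element has not been constructed.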


    Both approaches have the problem that they will not work in constant expressions at compile-time. std::start_lifetime_as is not constexpr, and the reinterpret_cast/std::launder combination is not permitted in constant expressions, because reinterpret_cast is not allowed there.

    That's probably also why std::inplace_vector was specified to be usable at compile-time only if the element type is trivial. See the final revision of the inplace_vector proposal [P0843R14].

    The proposal also mentions that value-initialization is required for the constexpr support, but I think that is only because they also want std::inplace_vector to be trivially-copyable if the element type is trivially-copyable and in that case default initialization may cause undefined behavior by copying indeterminate values.

    The paper also links to a reference implementation. This reference implementation uses the approach with a std::byte array for non-trivial types (in which case constexpr support is not required) and directly uses a normal (value-initialized) array for trivial types. It doesn't use a union.
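
    The general shape of such a split can be sketched like this. It is an assumption-laden illustration, not code from the reference implementation: trivial_storage, byte_storage and storage_for are hypothetical names, and std::is_trivial_v is used here as a stand-in for whatever condition an actual implementation checks.

```cpp
#include <cstddef>
#include <string>
#include <type_traits>

// Trivial element types: a plain, value-initialized array (usable in constant expressions).
template <class T, std::size_t N>
struct trivial_storage {
    T data[N]{};
};

// Other element types: suitably sized and aligned raw bytes; elements are constructed later.
template <class T, std::size_t N>
struct byte_storage {
    alignas(T) std::byte data[N * sizeof(T)];
};

// Selects the storage strategy from the element type at compile time.
template <class T, std::size_t N>
using storage_for = std::conditional_t<std::is_trivial_v<T>,
                                       trivial_storage<T, N>,
                                       byte_storage<T, N>>;

static_assert(std::is_same_v<storage_for<int, 4>, trivial_storage<int, 4>>);
static_assert(std::is_same_v<storage_for<std::string, 4>, byte_storage<std::string, 4>>);
```

    With this layout only the trivial branch needs to work at compile-time, which matches the constexpr restriction described above.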

    The reference implementation omits the std::launder call, but that call is technically currently required to avoid UB. Given that no compiler seems to perform any optimization that would be affected by the call, the authors probably omitted it intentionally. There is also currently a proposal to change the rules so that std::launder is not required for this case, see https://github.com/cplusplus/papers/issues/1703.