Tags: c++, arrays, dynamic-memory-allocation, c++20, lifetime

How to create an array and start its lifetime without starting the lifetime of any of its elements?


Arrays of any type are implicit-lifetime objects, and it is possible to begin the lifetime of an implicit-lifetime object without beginning the lifetime of its subobjects.

As far as I am aware, being able to create arrays without beginning the lifetime of their elements, in a way that doesn't result in UB, was one of the motivations for implicit-lifetime objects; see http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2020/p0593r6.html.

Now, what is the proper way to do it? Is allocating memory and returning a pointer to the array enough? Or is there something else one needs to be aware of?

Namely, is this code valid, and does it create an array with uninitialized members, or do we still have UB?

// implicitly creates an array of size n and returns a pointer to it
auto arrPtr = reinterpret_cast<T(*)[]>(::operator new(sizeof(T) * n, std::align_val_t{alignof(T)}));
// is there a difference between reinterpret_cast<T(*)[]> and reinterpret_cast<T(*)[n]>?
auto arr = *arrPtr; // de-reference of the result in previous line.

The question can be restated as follows.

According to https://en.cppreference.com/w/cpp/memory/allocator/allocate, the allocate function creates an array of type T[n] in the storage and starts its lifetime, but does not start the lifetime of any of its elements.

A simple question - how is it done? (ignoring the constexpr part, but I wouldn't mind if constexpr part is explained in the answer as well).

PS: The provided code is valid (assuming it is correct) for c++20, but not for earlier standards as far as I am aware.

I believe that an answer to this question should answer two similar questions I have asked earlier as well.

  1. Arrays and implicit-lifetime object creation.
  2. Is it possible to allocate an uninitialized array in a way that does not result in UB.

EDIT: I am adding a few code snippets to make my question clearer. I would appreciate an answer explaining which ones are valid and which ones are not.

PS: feel free to replace malloc with an aligned version, or with a ::operator new variation. As far as I am aware it doesn't matter.

Example #1

T* allocate_array(std::size_t n)
{
    return reinterpret_cast<T*>( malloc(sizeof(T) * n) ); 
    // does it return an implicitly constructed array (as long as 
    // subsequent usage is valid) or a T* pointer that does not "point"
    // to a T object that was constructed, hence UB
    // Edit: if we take n = 1 in this example, and T is not an implicit-lifetime
    // type, then we have a pointer to an object that has not yet been
    // constructed and doesn't have implicit lifetime - which is bad
}

Example #2.

T* allocate_array(std::size_t n)
{
    // malloc implicitly constructs - reinterpret_cast should return a pointer to
    // a suitably created object (a T array), hence, no UB here.
    T (*array_pointer)[] = reinterpret_cast<T(*)[]>(malloc(sizeof(T) * n));
    // The pointer in the previous line is a pointer to valid array, de-reference
    // is supposed to give me that array
    T* array = *array_pointer;
    return array;
}

Example #3 - same as 2 but size of array is known.

T* allocate_array(std::size_t n)
{
    // malloc implicitly constructs - reinterpret_cast should return a pointer to
    // a suitably created object (a T array), hence, no UB here.
    // (for T(*)[n] to be a valid type, n must be a compile-time constant)
    T (*n_array_pointer)[n] = reinterpret_cast<T(*)[n]>(malloc(sizeof(T) * n));
    // The pointer in the previous line is a pointer to valid array, de-reference
    // is supposed to give me that array
    T* n_array = *n_array_pointer;
    return n_array;
}

Are any of these valid?


The answer

While the wording of the standard is not 100% clear, after reading the paper more carefully, I conclude that the motivation is to make casts to T* legal, not casts to T(*)[] (see the paper's section "Dynamic construction of arrays"). The changes to the standard proposed by the authors of the paper also imply that the cast should be to T* and not to T(*)[]. Hence, I am accepting the answer by Nicol Bolas as the correct answer to my question.


Solution

  • The whole point of implicit object creation is that it is implicit. That is, you don't do anything to get it to happen. Once IOC occurs on a piece of memory, you may use the memory as if the object in question exists, and so long as you do that, your code works.

    When you get your T* back from allocator_traits<>::allocate, if you add 1 to the pointer, then the function has returned an array of at least 1 element (the new pointer could be the past-the-end pointer for the array). If you add 1 again, then the function has returned an array of at least 2 elements. Etc. None of this is undefined behavior.

    If you do something inconsistent with this (casting to a different pointer type and acting as though there is an array there), or if you act as though the array extends beyond the size of the storage that IOC applies to, then you get UB.

    So allocator_traits::allocate doesn't really have to do anything, so long as the operation the allocator uses to obtain the memory implicitly creates objects in it.


    // does it return an implicitly constructed array (as long as 
    // subsequent usage is valid) or a T* pointer that does not "point"
    // to a T object that was constructed, hence UB
    

    Neither. It returns a pointer (to type T) to storage into which objects may have been implicitly created already. Which objects have been implicitly created depends on how you use this storage. And merely doing a cast doesn't constitute "using" the storage.

    It isn't the reinterpret_cast that causes UB; it's using the pointer returned by an improper reinterpret_cast that's the problem. And since IOC works based on the operation that would have caused UB, IOC doesn't care what you cast the pointer to.

    Part and parcel of the IOC rules is the corollary "suitable created object" rule. This rule says that certain operations (like malloc and operator new) return a pointer to a "suitable created object". Essentially it's back to quantum superposition: if IOC retroactively creates an object to make your code work, then these functions retroactively return a pointer to whichever created object makes your code work.

    So if your code uses the pointer as a T* and does pointer arithmetic on that pointer, then malloc returned a pointer to the first element of an array of Ts. How big is that array? That depends: how big was the allocation, and how far did you do your pointer arithmetic? Does it have live Ts in it? That depends: do you try to access any Ts in the array?
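
The reasoning above can be sketched as follows. This is a minimal illustration under the answer's reading of the C++20 rules, not the way any particular standard library implements std::allocator. The names allocate_array, deallocate_array and total_length_demo are hypothetical and do not come from the question or the answer. The sketch obtains storage with ::operator new, treats the result as a T*, does pointer arithmetic across the (implicitly created) array, and starts each element's lifetime explicitly before reading it.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <string>

// Hypothetical helper: returns storage for n objects of type T.
// Assumption (following the answer): ::operator new implicitly creates
// objects, so the T* can be used as a pointer to the first element of an
// array whose elements are not yet within their lifetime.
template <class T>
T* allocate_array(std::size_t n)
{
    void* storage = ::operator new(sizeof(T) * n, std::align_val_t{alignof(T)});
    return static_cast<T*>(storage);
}

// Hypothetical counterpart: releases the storage without running destructors.
template <class T>
void deallocate_array(T* p)
{
    ::operator delete(p, std::align_val_t{alignof(T)});
}

// Creates n strings in storage from allocate_array and returns their total length.
std::size_t total_length_demo(std::size_t n)
{
    std::string* arr = allocate_array<std::string>(n);

    // Pointer arithmetic only; no element is accessed yet.
    std::string* end = arr + n;
    assert(static_cast<std::size_t>(end - arr) == n);

    // Start the lifetime of each element explicitly with placement new.
    for (std::string* p = arr; p != end; ++p)
        ::new (static_cast<void*>(p)) std::string("element");

    std::size_t total = 0;
    for (std::string* p = arr; p != end; ++p)
        total += p->size();

    // End the elements' lifetimes, then release the storage.
    std::destroy(arr, end);
    deallocate_array(arr);
    return total;
}
```

Note that the cast by itself touches nothing in the storage. What matters is that every T is read only after placement new has started its lifetime, and that the pointer arithmetic stays within the allocated size.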