Search code examples
c++memory-managementinitializationc++20move

how to handle moving a section of initialized values into overlapped partially initialized and partially uninitialized space in an allocation?


Background

Context, I'm trying to implement std::vector insert on a custom vector type. This applies to all overloads of vector::insert, I'm trying to maintain parity with C++'s std::vector.

In the case that I am forced to re-allocate memory, insert is actually trivial. When I don't need to reallocate though, when my capacity is >= the new size needed with the inserted element, I'm confused. I'm using the list here https://en.cppreference.com/w/cpp/container/vector/insert

In the case where I may not have trivially copyable types, and I may not have default constructable types, and I may not have movable types, then I'm not quite sure how to properly initialize data.

For example, for the following prototype

constexpr iterator insert(const_iterator pos, const T &value);

if inserting value does not trigger a resize of the allocation, then that means I can just use the already made allocation.

I would normally reach for something like copy_backwards, but the problem is that if I do that, I don't know if the type is cheaply movable instead, which if availible I should chose. While this doesn't mean anything for the input value, which should still be copied, I'm confused what needs to happen here, for example, assume I have

pointer_type m_ptr; 
size_type m_size; 
size_type m_capacity;
allocator_type m_allocator; 

I'll attempt something like this to just move the already valid data.

if constexpr (std::is_move_constructible_v<T>) {
    auto pos_idx = static_cast<size_type>(std::distance(cbegin(), pos));
    std::move_backward(m_ptr + pos_idx, m_ptr + m_size, m_ptr + pos_idx + 1);
    ...
}

But the problem with this is that I'm moving into uninitialized memory with the last element. So I could do an un-initialized move the last element first.

if constexpr (std::is_move_constructible_v<T>) {
    std::uninitialized_move(m_ptr + m_size - 1, m_ptr + size, m_ptr + size); 
    auto pos_idx = static_cast<size_type>(std::distance(cbegin(), pos));
    std::move_backward(m_ptr + pos_idx, m_ptr + m_size - 1, m_ptr + pos_idx + 1);
    ...
}

But now the space for the last element is hypothetically uninitialized? so I can't do a move backwards anymore.

What this implies is that I need to do an uninitialized move for every value past the insertion point right? But not only do I need to do that, I need to manually call the uninitialized move function for every value, because I need to effectievly do a std::uninitalized_move_backward(...) (which hopefully anyone looking at this would realize doesn't exist) since the uninit std functions are all not overlap safe? ie:

if constexpr (std::is_move_constructible_v<T>) {
    auto pos_idx = static_cast<size_type>(std::distance(cbegin(), pos));
    auto pos_end_reverse = pos != begin() ? (std::reverse_iterator(pos - 1) : rend()); 
    for(auto pos_r = rbegin(); pos_r != pos_end_reverse; ++pos_r){
        //subtract, because this is a reverse iterator and we are trying to look *forward*. 
        std::uninitialized_move_n(pos_r, 1, pos_r - 1); 
    } 
} 

Question

For the case where I have some initialized values that need to move in to overlapping initialized and uninitialized space in an allocation, is there any way around manually calling uninitialized move here for each element backwards or is there a stdlib method of handling this case?


Solution

  • A better understanding of the requirements here makes the required semantics much simpler than they appeared to be: merely remind yourself that a move leaves the moved-from object in a "valid, but unspecified state". This means that the moved-from object is not left in "uninitialized" state (as you call it).

    This means that during an insert a move into uninitialized space only happens for new values that are relocated past the last valid (initialized) value.

    Using a simple example: the existing vector has six values, vector[0] through vector[5], and you're inserting two new values at the beginning of the vector.

    This means that vector[5] to vector[7] and vector[4] to vector[6] is a move into uninitialized space. All the other moves, vector[3] to vector[5] are not. They are, effectively, a move assignment (since vector[5], and its predecessors are left in "valid but unspecified space).

    In conclusion, an insert that takes place without reallocation ends up a three part operation: 1) uninitialized move into all reserved, uninitialized space that the vector grows into, 2) a move assignment for all other moved values, 3) a move assignment of the new values in the vector that get inserted.

    And, no, there is no standard library algorithm that handles all of that for you, all of this logic must be implemented. But it's really not that complicated.