C++: Vector allocator behavior, memory allocation and smart pointers


Refer to the following code snippet.

According to my understanding:

a) The 'p1' and 'p2' objects are created on the stack and are destroyed at the end of the getPoints() function.

b) When p1 and p2 are added to the vector using push_back(), the default Allocator creates new instances of Point and copies the values (x, y) of p1 and p2 into these newly created instances.

My questions are:

1) Is my understanding correct?

If so:

2) If new Point objects are created by the Allocator, why do I see only two lines of "Point created"?

I expect to see two lines for p1 and p2, and two more lines for the objects newly created by the Allocator.

3) How does the Allocator assign the original values to the x and y fields of the newly created objects? Does it use a raw memory copy?

4) Is the shared pointer the recommended way to return a vector from a method?

#include <iostream>
#include <memory>   // std::shared_ptr, std::make_shared
#include <vector>

using namespace std;

struct Point {
    Point() {
        std::cout<< "Point created\n";
        x=0;
        y=0;
    }
    int x;
    int y;
};


std::shared_ptr< vector<Point> > getPoints() {
    std::shared_ptr< vector<Point> > ret =  std::make_shared< vector<Point> >();
    Point p1;
    p1.x=100;
    p1.y=200;

    Point p2;
    p2.x = 1000;
    p2.y = 2000;

    ret->push_back(p1);
    ret->push_back(p2);

    return ret;
}

int main(int argc, char** argv)
{
    std::shared_ptr< vector<Point> > points = getPoints();
    for (const auto& point : *points) {  // iterate by reference to avoid copying each Point
        std::cout << "Point x "<<point.x << " "<< point.y<<"\n";
    }

}

Solution

  • Q: Is my understanding correct?

    A: Your understanding is partially correct.

    • p1 and p2 are created on the stack, using the default no-argument constructor that you've defined.
    • The default allocator may be used to allocate more memory for the vector's elements when you call push_back(), but it will not always do so. It will never default-construct a new instance of Point, though.

    Q: If new Point objects are created by the Allocator, why do I see only two lines of "Point created"?

    A: New objects are not being default-constructed by the allocator. The allocator only provides more memory, if needed. The objects that you insert into the vector are copy constructed. Because you have not declared a copy constructor, the compiler has generated one for you, and that generated constructor prints nothing. This is why you see only the two lines printed by the default constructor for p1 and p2.

    Q: How does the Allocator assign the original values to the x and y fields of the newly created objects? Does it use a raw memory copy?

    A: As stated in the previous answer, the allocator does not copy the fields itself. The vector constructs each new element in the allocated storage by invoking Point's copy constructor when you do a push_back. The automatically generated copy constructor does a member-wise copy construction of each of the class's members. In your case, x and y are ints, so Point is trivially copyable and the copy is equivalent to a raw memory copy. If the members were class types, their copy constructors would be invoked.

    Q: Is the shared pointer the recommended way to return a vector from a method?

    A: This depends on your use case, and is opinion-based. My personal advice, and this applies to all kinds of objects, is:

    • If your use case allows it, return by value (that is, std::vector<Point> getPoints()).
    • If you need dynamically allocated storage, or the object that you want to return can be nothing because construction failed, return a std::unique_ptr. This applies to pretty much all factory functions that you might want to create. Even if you later want to share the ownership (see the next point), you can construct a shared_ptr by moving from a unique_ptr (std::shared_ptr<T> shared = std::move(unique);).
    • Avoid using shared_ptr unless you really need shared ownership of the object. shared_ptr instances are more complex to reason about, can create hard-to-debug cycles that lead to memory leaks, and are heavier in terms of performance (because of the atomic operations on their reference count and the additional memory allocated for a control block). If you think you need a shared_ptr, reconsider your design and think about whether you can use a unique_ptr instead.

    How this works:

    Internally, a std::vector stores its elements in heap memory obtained from the default allocator (or from a custom allocator, if you provided one). This allocation happens behind the scenes. The number of elements the storage can hold (the capacity) is tracked separately from the number of elements actually in the vector, but it is always >= size(). You can find out how many elements the vector has allocated storage for by using the capacity() function. When you call push_back(), the following happens:

    1. If there is enough storage (as determined by capacity()) to hold one more element, the new element is constructed in that storage from the argument that you passed to push_back: with the copy constructor for the push_back( const T& value ) overload, or with the move constructor for the push_back( T&& value ) overload.
    2. If there is not enough storage (i.e. the new size() would exceed capacity()), a larger block of memory is allocated. How much memory is allocated is up to the implementation; a common pattern is to grow the capacity geometrically, for example by a factor of 1.5 or 2. You may call reserve() before inserting elements to make sure that your vector has enough capacity to hold at least that many elements without new allocations. After the new memory has been allocated, the vector transfers all existing elements into the new storage, typically moving them if their move constructor is noexcept (or if they are not copy-insertable) and copying them otherwise. This reallocation invalidates all iterators and references to elements in the vector.