How is vector implemented in C++


I am thinking of how I can implement std::vector from the ground up.

How does it resize the vector?

realloc only seems to work for plain old structs, or am I wrong?


Solution

  • It is a simple templated class that wraps a native array. It does not use malloc/realloc; instead, it uses the allocator passed as a template argument (std::allocator by default).

    Resizing is done by allocating a new array and copy constructing each element in the new array from the old one (this way it is safe for non-POD objects). To avoid frequent allocations, implementations typically grow the capacity geometrically (commonly by a factor of 1.5 or 2), which keeps push_back amortized constant time.

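    UPDATE: since C++11, the elements are moved instead of copy constructed when the stored type allows it. In practice that means the type has a noexcept move constructor (or cannot be copied at all); otherwise the elements are still copied, so that a failed reallocation leaves the vector unchanged.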
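
    The reallocation step described above can be sketched roughly as follows. This is only an illustration, not how any particular standard library writes it: grow_buffer is a made-up helper name, and it assumes the allocator's pointer type is a plain T* (true for std::allocator). It uses std::allocator_traits to allocate and construct, and std::move_if_noexcept to choose between moving and copying.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical helper (not a standard function): transfers `size` live
// elements from `old_data` into a fresh buffer with room for `new_capacity`
// elements, then releases the old buffer. Assumes A::pointer is T*.
template <class T, class A>
T* grow_buffer(A& alloc, T* old_data, std::size_t size,
               std::size_t old_capacity, std::size_t new_capacity) {
    using traits = std::allocator_traits<A>;
    assert(new_capacity >= size);

    T* new_data = traits::allocate(alloc, new_capacity);
    std::size_t constructed = 0;
    try {
        // Move when the move constructor is noexcept, otherwise copy.
        for (; constructed < size; ++constructed) {
            traits::construct(alloc, new_data + constructed,
                              std::move_if_noexcept(old_data[constructed]));
        }
    } catch (...) {
        // Undo the partial transfer; the old buffer is still intact
        // because a throwing constructor here means we were copying.
        for (std::size_t i = 0; i < constructed; ++i) {
            traits::destroy(alloc, new_data + i);
        }
        traits::deallocate(alloc, new_data, new_capacity);
        throw;
    }

    // The transfer succeeded: destroy the old elements and free their storage.
    for (std::size_t i = 0; i < size; ++i) {
        traits::destroy(alloc, old_data + i);
    }
    if (old_data) {
        traits::deallocate(alloc, old_data, old_capacity);
    }
    return new_data;
}
```

    Because every element is transferred through its own constructor, this works for types that realloc could not relocate safely, including move-only types such as std::unique_ptr.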

    In addition to this, it will need to store the current "size" and "capacity". Size is how many elements are actually in the vector. Capacity is how many elements the currently allocated storage can hold before a reallocation is needed.
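
    The difference between size and capacity can be observed on the real std::vector. The small function below (count_reallocations is just an illustrative name) counts how often the capacity changes while elements are appended; with a non-linear growth policy the count should be far smaller than the number of push_back calls. The exact number depends on the standard library, so none is given here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n ints and returns how many times the capacity changed,
// i.e. how many reallocations push_back triggered.
std::size_t count_reallocations(std::size_t n) {
    std::vector<int> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
        assert(v.size() <= v.capacity());  // size never exceeds capacity
        if (v.capacity() != last_capacity) {
            ++reallocations;
            last_capacity = v.capacity();
        }
    }
    return reallocations;
}
```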

    So as a starting point a vector will need to look somewhat like this:

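    #include <cstddef> // std::size_t
    #include <memory>  // std::allocator

    template <class T, class A = std::allocator<T> >
    class vector {
    public:
        // public member functions
    private:
        // NOTE: a "real" implementation would use the typedefs provided by the allocator.
        // But I'm keeping it simple for clarity
        T*          data_;     // points to first element
        std::size_t capacity_; // number of elements the storage can hold
        std::size_t size_;     // number of elements currently stored
        A           allocator_;
    };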
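
    To make the size/capacity layout concrete, here is a minimal sketch of how the member functions could be filled in. The class name mini_vector, the doubling growth factor, and the reallocate helper are choices made for this illustration only. To stay short it disables copying of the container and assumes element construction does not throw.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>

// Illustrative only: not exception safe, and assumes A::pointer is T*.
template <class T, class A = std::allocator<T>>
class mini_vector {
    using traits = std::allocator_traits<A>;

public:
    mini_vector() = default;
    mini_vector(const mini_vector&) = delete;
    mini_vector& operator=(const mini_vector&) = delete;

    ~mini_vector() {
        for (std::size_t i = 0; i < size_; ++i) {
            traits::destroy(allocator_, data_ + i);
        }
        if (data_) {
            traits::deallocate(allocator_, data_, capacity_);
        }
    }

    // Taking the argument by value keeps v.push_back(v[0]) safe even
    // when the call triggers a reallocation.
    void push_back(T value) {
        if (size_ == capacity_) {
            reallocate(capacity_ == 0 ? 1 : capacity_ * 2);
        }
        traits::construct(allocator_, data_ + size_, std::move(value));
        ++size_;
    }

    T* begin() { return data_; }
    T* end() { return data_ + size_; }  // needs an addition
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

    T& operator[](std::size_t i) {
        assert(i < size_);
        return data_[i];
    }

private:
    void reallocate(std::size_t new_capacity) {
        T* new_data = traits::allocate(allocator_, new_capacity);
        for (std::size_t i = 0; i < size_; ++i) {
            traits::construct(allocator_, new_data + i,
                              std::move_if_noexcept(data_[i]));
            traits::destroy(allocator_, data_ + i);
        }
        if (data_) {
            traits::deallocate(allocator_, data_, capacity_);
        }
        data_ = new_data;
        capacity_ = new_capacity;
    }

    T*          data_ = nullptr;
    std::size_t capacity_ = 0;
    std::size_t size_ = 0;
    A           allocator_;
};
```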
    

    The other common implementation stores pointers to the different parts of the array. This makes end() slightly cheaper (it no longer needs an addition) at the expense of a marginally more expensive size() (which now needs a subtraction). In that case it could look like this:

    #include <memory> // std::allocator

    template <class T, class A = std::allocator<T> >
    class vector {
    public:
        // public member functions
    private:
        // NOTE: a "real" implementation would use the typedefs provided by the allocator.
        // But I'm keeping it simple for clarity
        T* data_;         // points to first element
        T* end_capacity_; // points to one past internal storage
        T* end_;          // points to one past last element
        A  allocator_;
    };
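
    The trade-off between the two layouts shows up in the accessors. The sketch below uses a hypothetical non-owning struct, three_pointer_view, with the same three pointers, purely to show which operations become a plain load and which need a subtraction; make_view is likewise an invented helper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical non-owning view that mirrors the three-pointer layout.
template <class T>
struct three_pointer_view {
    T* data_;          // points to first element
    T* end_capacity_;  // points to one past internal storage
    T* end_;           // points to one past last element

    T* begin() const { return data_; }
    T* end() const { return end_; }  // no addition needed

    // size() and capacity() are now pointer subtractions.
    std::size_t size() const {
        return static_cast<std::size_t>(end_ - data_);
    }
    std::size_t capacity() const {
        return static_cast<std::size_t>(end_capacity_ - data_);
    }

    // The "do we need to grow?" check is a single pointer comparison.
    bool full() const { return end_ == end_capacity_; }
};

// Builds a view over a caller-owned buffer with `used` live elements.
template <class T>
three_pointer_view<T> make_view(T* buffer, std::size_t capacity,
                                std::size_t used) {
    assert(used <= capacity);
    return {buffer, buffer + capacity, buffer + used};
}
```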
    

    I believe gcc's libstdc++ uses the latter approach, but both approaches are equally valid and conforming.

    NOTE: This ignores a common optimization in which the empty base class optimization is applied to the allocator, so that a stateless allocator takes up no space in the vector. I think that is a quality-of-implementation detail, and not a matter of correctness.
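
    For the curious, the effect of that optimization can be seen by comparing two storage layouts. The struct names below are invented for this example. On mainstream compilers the base-class variant is typically one pointer-sized slot smaller when the allocator is empty, though the exact sizes depend on the platform.

```cpp
#include <cstddef>
#include <memory>

// Allocator stored as a data member: even an empty allocator occupies
// at least one byte, and padding usually rounds that up.
template <class T, class A = std::allocator<T>>
struct member_storage {
    T* data_;
    T* end_capacity_;
    T* end_;
    A  allocator_;
};

// Allocator stored as a private base class: an empty base is allowed
// to occupy no storage at all.
template <class T, class A = std::allocator<T>>
struct ebo_storage : private A {
    T* data_;
    T* end_capacity_;
    T* end_;

    A& allocator() { return *this; }
};

// Bytes saved by the base-class variant for element type T.
template <class T>
constexpr std::size_t bytes_saved() {
    return sizeof(member_storage<T>) - sizeof(ebo_storage<T>);
}
```

    The base-class trick only works when the allocator type can be derived from (it must not be final). Since C++20 the [[no_unique_address]] attribute on a data member is another way to get a similar effect.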