I have heard that a vector of vectors is bad in terms of performance. For example, I have the following 2D std::vector:
std::vector< std::vector<char> > characterMatrix;
// for iterating
for ( int row = 0; row < getY_AxisLen( ); ++row )
{
for ( int column = 0; column < getX_AxisLen( ); ++column )
{
std::cout << characterMatrix[ row ][ column ];
}
}
With this approach, the matrix gets printed in 3-12 milliseconds on my system. I would be glad to see a decrease of, say, 1-3 milliseconds.
As far as I know, each of the internal vectors (i.e. the rows) is stored at a different location on the heap, so this causes a lot of fragmentation.
Not only that, but sizeof(std::vector) on my compiler returns 24 bytes. This means that if, for instance, characterMatrix has 50 rows (i.e. 50 internal vectors), it allocates 24 * 50 == 1200 bytes on the heap just to store the control blocks of those 50 vectors, and that is in addition to the space taken by the actual data (the chars) in the matrix.
Now, if I want to keep all the chars in a single contiguous block of memory, maybe I can write it as a 1D vector like:
std::vector< char > characterMatrix;
// for iterating
for ( int row = 0; row < getY_AxisLen( ); ++row )
{
for ( int column = 0; column < getX_AxisLen( ); ++column )
{
std::cout << characterMatrix[ row * getX_AxisLen( ) + column ]; // is this correct?
}
}
Is this a valid way of doing it? Can someone tell me what things I should keep in mind if I want to change the implementation of my matrix variable in this way? What are the possible downsides?
"have heard" together with performance is never the right approach. To address performance, the golden rule is: Benchmark first!
Also, performance isn't always the most important thing. Typically, you should only optimize for performance once you find that your application isn't fast enough for your purposes. Then, keep the following in mind:
Yes, a 2D vector is an indication that you might not have ideal performance, since, as you said, the data is not all in one place.
Then again, as you currently do it, the index computation performed on each access inside the innermost loop is also "expensive".
So, having your data in one large vector instead of a 2D vector can be faster, especially if you always have to visit all elements anyway and don't need "neighbourhood" access; in that case you can simply have one loop iterating from 0 to what you call getY_AxisLen() * getX_AxisLen(). Having a benchmark in place, as hinted at above, can help you find out which optimizations make sense!
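A minimal sketch of that single-loop traversal, assuming the flattened characterMatrix and the getX_AxisLen()/getY_AxisLen() helpers from your question, could look like this:
// One pass over the contiguous buffer instead of nested row/column loops.
const std::size_t total = static_cast<std::size_t>( getY_AxisLen( ) ) * getX_AxisLen( );
for ( std::size_t i = 0; i < total; ++i )
{
    std::cout << characterMatrix[ i ];
}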
To address the reduced readability/maintainability, it could be helpful in your case to abstract away the data structure used for storing the 2D data, so that the implementation of how the data is stored is hidden from the places where the data is accessed.
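One possible sketch of such an abstraction is a small wrapper class that keeps the data in one contiguous std::vector<char> and exposes row/column access; the class and member names here are purely illustrative:
#include <cstddef>
#include <vector>

// Illustrative sketch: a 2D view over one contiguous 1D buffer.
class CharMatrix
{
public:
    CharMatrix( std::size_t rows, std::size_t columns )
        : m_rows( rows ), m_columns( columns ), m_data( rows * columns, ' ' )
    {
    }

    // Row-major indexing: element (row, column) lives at row * m_columns + column.
    char& at( std::size_t row, std::size_t column )
    {
        return m_data[ row * m_columns + column ];
    }

    char at( std::size_t row, std::size_t column ) const
    {
        return m_data[ row * m_columns + column ];
    }

    std::size_t rows( ) const { return m_rows; }
    std::size_t columns( ) const { return m_columns; }

private:
    std::size_t m_rows;
    std::size_t m_columns;
    std::vector<char> m_data; // single contiguous allocation for all elements
};
This way, the calling code only sees at( row, column ), and you can switch between a vector of vectors, a flat 1D vector, or something else entirely without touching the code that iterates over the matrix.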