I have heard that a vector of vectors is bad in terms of performance. For example, I have the following 2D std::vector:
std::vector< std::vector<char> > characterMatrix;
// for iterating
for ( int row = 0; row < getY_AxisLen( ); ++row )
{
for ( int column = 0; column < getX_AxisLen( ); ++column )
{
std::cout << characterMatrix[ row ][ column ];
}
}
With this approach, the matrix gets printed in 3-12 milliseconds on my system. I would be glad to see a decrease of, say, 1-3 milliseconds.
As far as I know, each of the internal vectors (i.e. the rows) is stored at a different location on the heap, so this causes a lot of fragmentation.
Not only that, but sizeof(std::vector) on my compiler returns 24 bytes. This means that if, for instance, characterMatrix has 50 rows (i.e. 50 internal vectors), it allocates 24 * 50 == 1200 bytes on the heap just to store the control blocks of those 50 vectors, and that is in addition to the space taken by the actual data (the chars) in the matrix.
Now, if I want to keep all the chars in a single contiguous block of memory, maybe I can write it as a 1D vector like:
std::vector< char > characterMatrix;
// for iterating
for ( int row = 0; row < getY_AxisLen( ); ++row )
{
for ( int column = 0; column < getX_AxisLen( ); ++column )
{
std::cout << characterMatrix[ row * getX_AxisLen( ) + column ]; // is this correct?
}
}
Is this a valid way of doing it? Can someone tell me what things I should keep in mind if I want to change the implementation of my matrix variable in this way? What are the possible downsides?
"have heard" together with performance is never the right approach. To address performance, the golden rule is: Benchmark first!
Also, performance isn't always the most important thing. Typically, you should only optimize for performance once you find that your application isn't fast enough for your purposes. Then, keep the following in mind:
Yes, a 2D vector is an indication that you might not have ideal performance, since, as you said, the data is not all in one place.
Then again, as you currently do it, the index computation performed on each access inside the innermost loop is also "expensive".
So, having your data in one large vector instead of a 2D vector can be faster, especially if you always have to visit all elements anyway and don't need "neighbourhood" access; in that case you can simply have one loop iterating from 0 to what you call getY_AxisLen() * getX_AxisLen(). Having a benchmark in place, as hinted at above, can help you find out which optimizations make sense!
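A minimal sketch of that single-loop traversal, assuming the flattened characterMatrix and the getX_AxisLen()/getY_AxisLen() helpers from your question, could look like this:
// One pass over the contiguous buffer instead of nested row/column loops.
const std::size_t total = static_cast<std::size_t>( getY_AxisLen( ) ) * getX_AxisLen( );
for ( std::size_t i = 0; i < total; ++i )
{
    std::cout << characterMatrix[ i ];
}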
To address the reduced readability/maintainability, it could be helpful in your case to abstract away the data structure used for storing the 2D data, so that the implementation of how the data is stored is hidden from the places where the data is accessed.
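One possible sketch of such an abstraction is a small wrapper class that keeps the data in one contiguous std::vector<char> and exposes row/column access; the class and member names here are purely illustrative:
#include <cstddef>
#include <vector>

// Illustrative sketch: a 2D view over one contiguous 1D buffer.
class CharMatrix
{
public:
    CharMatrix( std::size_t rows, std::size_t columns )
        : m_rows( rows ), m_columns( columns ), m_data( rows * columns, ' ' )
    {
    }

    // Row-major indexing: element (row, column) lives at row * m_columns + column.
    char& at( std::size_t row, std::size_t column )
    {
        return m_data[ row * m_columns + column ];
    }

    char at( std::size_t row, std::size_t column ) const
    {
        return m_data[ row * m_columns + column ];
    }

    std::size_t rows( ) const { return m_rows; }
    std::size_t columns( ) const { return m_columns; }

private:
    std::size_t m_rows;
    std::size_t m_columns;
    std::vector<char> m_data; // single contiguous allocation for all elements
};
This way, the calling code only sees at( row, column ), and you can switch between a vector of vectors, a flat 1D vector, or something else entirely without touching the code that iterates over the matrix.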