Search code examples
c++arraysperformance-testingdynamic-memory-allocationcellular-automata

C++ - Performance of static arrays, with variable size at launch


I wrote a cellular automaton program that stores data in a matrix (an array of arrays). For a 300*200 matrix I can achieve 60 or more iterations per second using static memory allocation (e.g. std::array).

I would like to produce matrices of different sizes without recompiling the program every time, i.e. the user enters a size and then the simulation for that matrix size begins. However, if I use dynamic memory allocation, (e.g. std::vector), the simulation drops to ~2 iterations per second. How can I solve this problem? One option I've resorted to is to pre-allocate a static array larger than what I anticipate the user will select (e.g. 2000*2000), but this seems wasteful and still limits user choice to some degree.

I'm wondering if I can either

a) allocate memory once and then somehow "freeze" it for ordinary static array performance?

b) or perform more efficient operations on the std::vector? For reference, I am only performing matrix[x][y] == 1 and matrix[x][y] = 1 operations on the matrix.

According to this question/answer, there is no difference in performance between std::vector or using pointers.

EDIT:

I've rewritten the matrix, as per UmNyobe' suggestion, to be a single array, accessed via matrix[y*size_x + x]. Using dynamic memory allocation (sized once at launch), I double the performance to 5 iterations per second.

As per PaulMcKenzie's comment, I compiled a release build and got the performance I was looking for (60 or more iterations per second). However, this is the foundation for more, so I still want to quantify the benefit of one method over the other more thoroughly, so I used a std::chrono::high_resolution_clock to time each iteration, and found that the performance difference between dynamic and static arrays (after using a single array matrix representation) to be within the margin of error (450~600 microseconds per iteration).

The performance during debugging is a slight concern however, so I think I'll keep both, and switch to a static array when debugging.


Solution

  • For reference, I am only performing

    matrix[x][y]
    
    • Red flag! Are you using vector<vector<int>>for your matrix representation? This is a mistake, as rows of your matrix will be far apart in memory. You should use a single vector of size rows x cols and use matrix[y * row + x]
    • Furthermore, you should follow the approach where you index first by rows and then by columns, ie matrix[y][x] rather than matrix[x][y]. Your algorithm should also process the same way. This is due to the fact that with matrix[y][x] (x, y) and (x + 1, y) are one memory block from each other while with any other mechanism elements (x,y) and (x + 1, y), (x, y + 1) are much farther away.

    Even if there is performance decrease from std::array to std::vector (as the array can have its elements on the stack, which is faster), a decent algorithm will perform on the same magnitude using both collections.