I am trying to write an efficient multistep solver in C++ using the Eigen library. To do so, I need a few variables that keep track of the history. One of them is of type Eigen::VectorX<Eigen::Matrix<double, 9, 10>> DD, which is resized to 10000 somewhere in my initialization code.
Even though I know the size of DD at compile time, I need to allocate it on the heap because DD is too large for the stack (at least that's what Visual Studio 2022 tells me).
And here things are getting confusing to me. I'll try to explain why:
You may have noticed that DD is a VectorX of size 10000 living on the heap. Its elements are of type Matrix<double, 9, 10>, hence living on the stack (?). Even writing this down sounds wrong to me. Is it possible to have an object that is allocated on both the heap and the stack? Or is the compiler moving everything to the heap?
If allocating on both the heap and the stack is possible: does it affect performance when reading from DD?
I thought I would replace Matrix<double, 9, 10> by MatrixXd and give it a try. Well, it works, but things are slowed down by 15-20%. So, somehow I think this question gives an answer to my first and second question, but I am really not convinced yet.
If I could replace VectorX by Vector<..., 10000>, would this increase performance? How could I "force" my program to use the stack instead of the heap?
What other or better options do I have for defining DD? I replaced Eigen::Vector by std::vector, which didn't really change anything, at least performance-wise.
I need to read from DD all the time, so any help is highly appreciated.
Memory is memory; your CPU has no concept of stack vs. heap. The difference is that they are different data structures. Allocating an element on the stack is very cheap: you only have to move the stack pointer. If you deallocate in the reverse order, this is also cheap: you just move the stack pointer back. If you have many small allocations, such as local variables in a function, the stack is a suitable data structure. For large working data, the allocation cost is negligible, and it is fine to allocate it on the heap.
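To put rough numbers on this for DD, here is a minimal sketch of the size arithmetic. It is not Eigen code: std::array<double, 90> is used as a stand-in for Eigen::Matrix<double, 9, 10> so that it compiles without Eigen, and the names Block, kCount, kDefaultStackBytes, payload_bytes and make_history are made up for illustration. It assumes 8-byte doubles and the 1 MB default stack size of the MSVC linker.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for Eigen::Matrix<double, 9, 10>: 90 doubles stored inline
// (720 bytes on typical platforms with 8-byte doubles).
using Block = std::array<double, 9 * 10>;

constexpr std::size_t kCount = 10000;                    // history length from the question
constexpr std::size_t kDefaultStackBytes = 1024 * 1024;  // assumption: MSVC default stack size

// Bytes needed for the elements alone.
constexpr std::size_t payload_bytes() { return kCount * sizeof(Block); }

// The payload alone is several times larger than the whole default stack.
static_assert(payload_bytes() > kDefaultStackBytes, "DD would not fit on the stack");

// The std::vector object itself is only a small handle (pointer plus bookkeeping);
// the 10000 blocks live in a single heap allocation that it owns.
inline std::vector<Block> make_history() {
    std::vector<Block> history(kCount);
    assert(history.size() * sizeof(Block) == payload_bytes());
    return history;
}
```

The one-time cost of that single heap allocation is paid during initialization, which is why it hardly matters for data that is then read many times.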
Eigen::VectorX<Eigen::Matrix<double, 9, 10>> points to a contiguous block of memory, so the elements of type Eigen::Matrix<double, 9, 10> are located in a heap-allocated array. The matrices themselves are not separately allocated, as you can compute the memory address from the index and the base pointer of DD. DD is indexed in the outer loop, so it is less critical to performance.
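The address computation mentioned above can be sketched as follows. Again this is a stand-in rather than Eigen code: Block (a std::array<double, 90>) plays the role of Eigen::Matrix<double, 9, 10>, a std::vector<Block> plays the role of DD, and the helper names at, predicted_offset and actual_byte_offset are hypothetical. The index formula assumes column-major order inside each block, which is Eigen's default storage order.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

constexpr std::size_t kRows = 9;
constexpr std::size_t kCols = 10;

// Stand-in for Eigen::Matrix<double, 9, 10>: 90 doubles stored inline.
using Block = std::array<double, kRows * kCols>;

// Element (r, c) of block i, assuming column-major order inside each block.
inline double& at(std::vector<Block>& dd, std::size_t i, std::size_t r, std::size_t c) {
    assert(i < dd.size() && r < kRows && c < kCols);
    return dd[i][c * kRows + r];
}

// Offset (in doubles) from the base pointer, predicted from the indices alone.
constexpr std::size_t predicted_offset(std::size_t i, std::size_t r, std::size_t c) {
    return i * kRows * kCols + c * kRows + r;
}

// Actual offset (in bytes) of the element from the start of the storage.
inline std::size_t actual_byte_offset(std::vector<Block>& dd, std::size_t i,
                                      std::size_t r, std::size_t c) {
    const auto base = reinterpret_cast<std::uintptr_t>(dd.data());
    const auto elem = reinterpret_cast<std::uintptr_t>(&at(dd, i, r, c));
    return static_cast<std::size_t>(elem - base);
}
```

For any valid indices, actual_byte_offset(dd, i, r, c) should equal predicted_offset(i, r, c) * sizeof(double), which is what "not separately allocated" means in practice. By contrast, a dynamic-size element such as MatrixXd owns its own heap buffer, so the outer array only holds small handles and every read needs an extra pointer dereference. That may account for part of the slowdown reported in the question, though only profiling can confirm it.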