I have a sparse matrix from the Eigen library, defined as:
Eigen::SparseMatrix<float> MyMatrix(1 << n, 1 << n);
In addition, I call reserve():
MyMatrix.reserve(Eigen::VectorXi::Constant(1 << n, n + 1));
The matrix has 2^n columns with n+1 nonzero entries in each column.
I want the actual size in bytes this object occupies.
If I use:
MyMatrix.size()
It gives n_rows*n_columns.
I'm sure this is not the amount of memory actually used, because I compared it against the memory of my computer.
For example, I can create a 2^25 x 2^25 sparse matrix of floats this way. Stored densely, it would occupy on the order of 10^15 bytes, which is simply impossible.
If I write
sizeof(MyMatrix)
it gives 72, whichever n I use. That is presumably the size of the class object itself, not of the data it actually stores.
Update 2:
This is the right way to compute its size:
one float and one int per reserved (or used) element, plus two ints per column, plus the fixed sizeof(Matrix) overhead.
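As discussed in the comments, here is how the sparse matrix format determines the size.
You call reserve(), so the specific sub-format is uncompressed. This means we have
- two arrays with one entry per reserved element: the float values and the int inner (row) indices (each 2**25 * 26 * 4 byte)
- two arrays with one int per column: the outer index (start of each column) and the number of nonzeros per column (each 2**25 * 4 byte)
That gives us (2 * 2**25 * 26 * 4 + 2 * 2**25 * 4) / 1024**3 = 6.75 GiB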
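The accounting above can be sketched as a small helper. This is only an estimate under stated assumptions: the names SparseFootprint, estimate_footprint and to_gib are hypothetical and not part of Eigen, and the sketch ignores the fixed object overhead, the extra trailing entry of the outer index array, and any rounding done by the allocator.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not part of Eigen: byte counts of the four arrays
// of an uncompressed column-major sparse matrix after reserve().
struct SparseFootprint {
    std::uint64_t values;        // one scalar per reserved element
    std::uint64_t inner_indices; // one index per reserved element
    std::uint64_t outer_starts;  // one index per column
    std::uint64_t inner_counts;  // one index per column

    std::uint64_t total() const {
        return values + inner_indices + outer_starts + inner_counts;
    }
};

// Estimate the footprint from the column count, the reserved elements per
// column, and the byte sizes of the scalar and index types.
inline SparseFootprint estimate_footprint(std::uint64_t cols,
                                          std::uint64_t reserved_per_col,
                                          std::size_t scalar_bytes,
                                          std::size_t index_bytes)
{
    assert(scalar_bytes > 0 && index_bytes > 0);
    const std::uint64_t reserved = cols * reserved_per_col;
    return SparseFootprint{reserved * scalar_bytes, reserved * index_bytes,
                           cols * index_bytes, cols * index_bytes};
}

// Convert a byte count to GiB (1024**3 bytes).
inline double to_gib(std::uint64_t bytes)
{
    return static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0);
}
```

For the case discussed here, estimate_footprint(1ull << 25, 26, 4, 4).total() comes to 7,247,757,312 bytes, and to_gib() of that value is 6.75.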
Let's put that to the test:
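#include <Eigen/Dense>
#include <Eigen/Sparse>
#include <malloc.h>  // malloc_stats() is a glibc extension

int main()
{
    int size = 1 << 25;
    // The matrix is column-major, so reserve() takes one count per column.
    int nonzero_per_col = 26;
    Eigen::SparseMatrix<float> mat(size, size);
    mat.reserve(Eigen::VectorXi::Constant(size, nonzero_per_col));
    malloc_stats();  // prints allocator statistics to stderr
}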
This prints:
Arena 0:
system bytes = 135168
in use bytes = 74400
Total (incl. mmap):
system bytes = 2952941568
in use bytes = 2952880800
max mmap regions = 4
max mmap bytes = 7247773696
As you can see, there are four mmapped allocations with a total size of 7,247,773,696 bytes, which is 6.75 GiB. The smaller figures under "Total" appear to be the same amount wrapped around at 2**32.
The reason this works on your laptop with less memory is that you don't use that memory yet. The memory is mmapped but not initialized, so the operating system maps it all to the single zero page it keeps for this exact purpose. See, for example, "Allocating more memory than there exists using malloc".
One thing that should be noted is that this format uses a simple int to denote the array offset of the nonzero elements. This means the total number of nonzeros (and reserved elements) must not exceed 2**31-1 (signed int range). With 2**25 * 26 elements, you are already close to 2**30 elements.
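A quick way to check this limit before allocating is sketched below. The function name fits_in_int32_index is hypothetical, and the sketch assumes that int is 32 bits wide, as it is on common platforms. It divides instead of multiplying so that the check itself cannot overflow.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if cols * reserved_per_col elements can
// be addressed with a signed 32-bit storage index.
inline bool fits_in_int32_index(std::int64_t cols, std::int64_t reserved_per_col)
{
    assert(cols >= 0 && reserved_per_col >= 0);
    const std::int64_t limit = std::numeric_limits<std::int32_t>::max();
    if (reserved_per_col == 0) {
        return true;  // nothing reserved, nothing to address
    }
    // Equivalent to cols * reserved_per_col <= limit, without the product.
    return cols <= limit / reserved_per_col;
}
```

With n = 25 (2**25 columns, 26 entries each) the check passes, and it still passes for n = 26, but for n = 27 (2**27 columns, 28 entries each) it fails.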
Unless you know that this is the absolute upper limit with no concern for growth, I recommend you change the type to Eigen::SparseMatrix<float, Eigen::ColMajor, Eigen::Index>, using Eigen::Index, a.k.a. std::ptrdiff_t, instead of int. On a 64-bit platform every stored index then grows from 4 to 8 bytes while the float values stay at 4 bytes, so this will bump up the memory use to about 10.25 GiB, but it will remove all concerns about potential integer overflows.