I need to set sparsity pattern of Eigen::SparseMatrix
which I already know (I have unique sorted column indices and row offsets). Obviously it's possible via setFromTriplets
but unfortunately setFromTriplets
requires a lot of additional memory (at least in my case)
I wrote small example
const long nRows = 5000000;
const long nCols = 100000;
const long nCols2Skip = 1000;
//It's quite big!
const long nTriplets2Reserve = nRows * (nCols / nCols2Skip) * 1.1;
Eigen::SparseMatrix<double, Eigen::RowMajor, long> mat(nRows, nCols);
std::vector<Eigen::Triplet<double, long>> triplets;
triplets.reserve(nTriplets2Reserve);
for(long row = 0; row < nRows; ++row){
for(long col = 0; col < nCols; col += nCols2Skip){
triplets.push_back(Eigen::Triplet<double, long>(row, col, 1));
}
}
std::cout << "filling mat" << std::endl << std::flush;
mat.setFromTriplets(triplets.begin(), triplets.end());
std::cout << "Finished! nnz " << mat.nonZeros() << std::endl;
//Stupid way to check memory consumption
std::cin.get();
In my case this example consumes something about 26Gb at peak (between lines "filling mat" and "Finished") and 18Gb after all. (I made all checks via htop
). ~8Gb overhead is quite big for me (in my "real world" task i have bigger overhead).
So i have two questions:
Eigen::SparseMatrix
with as little overhead as possible?setFromTriplets
require so much memory?Please let me know if my example is wrong.
My Eigen version is 3.3.2
PS Sorry for my English
EDIT: It's looks like inserting (with preallocation) each triplet manually works faster and requires less memory at peak. But I still want to know is it possible to set sparsity pattern manually.
Ad 1: You can be even a bit more efficient than plain insert
by using the internal functions startVec
and insertBack
, if you can guarantee that you insert elements in lexicographical order.
Ad 2: If you use setFromTriplets
you need approximately twice final the size of the matrix (plus the size of your Triplet container), since the elements are first inserted in a transposed version of the matrix, which is then transposed to the final matrix in order to make sure that all inner vectors are sorted. If you know the structure of your matrix ahead, this is obviously quite a waste of memory, but it is intended to work on arbitrary input data.
In your example you have 5000000 * 100000 / 1000 = 5e8 elements. A Triplet
requires 8+8+8 = 24 bytes (making about 12Gb for the vector
) and each element of the sparse matrix requires 8+8=16 bytes (one double
for the value, one long
for the inner index), i.e., about 8Gb per matrix, so in total you require about 28Gb which is about 26 Gib.
Bonus:
If your matrix has some special structure, which can be stored more efficiently, and you are willing to dig deeper into the Eigen internals, you may also consider implementing a new type inheriting from Eigen::SparseBase<>
(but I don't recomment this, unless memory/performance is very critical for you, and you are willing to go through a lot of "sparsely" documented internal Eigen code ...). However, in that case it is probably easier to think about what you intend to do with your matrix, and try to implement only special operations for that.