How to efficiently assemble a FEM sparse matrix

Deal all, Thank you for your time to read my question. I'm using Eigen3.3.4(http://eigen.tuxfamily.org/index.php?title=Main_Page) to write some FEM code.

I read the Eigen3.3.4 document, and in this website(http://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html), it says we should use Ref<MatrixBase> to avoid the additioinal copy and obtain high performance.

So in my FEM code, for the sparse matrix assemble part, let's say the function is:

FormFE(const Ref<VectorXd> &U,const Ref<VectorXd> &V,
Ref<SparseMatrix<double> > AMATRIX,Ref<VectorXd> RHS)

where U represent the displacement, V represent the velocity term. AMATRIX is my sparse matrix, RHS is the residual term.

Then I try to first initializing my AMATRIX before assemble(I have a tripletList which contains all the non-zero element and its value(I set the value to zero for initializing)) So I tried:

AMATRIX.setFromTriplets(ZeroTripList.begin(),ZeroTripList.end());

But I have an error:

class Eigen::Ref<Eigen::SparseMatrix<double, 0, int> >’ has no member named ‘setFromTriplets

So how could I solve this problem?

One of my solution is using :

FormFE(const Ref<VectorXd> &U,const Ref<VectorXd> &V,
SparseMatrix<double> &AMATRIX,Ref<VectorXd> RHS)

this is working very fine, but I'm not sure it is efficient or not. I'm not very good at cpp :P

Actually, My questions are:

How to efficiently use Eigen(especially for FEM calculation), I use Eigen's VectorXd and MatrixXd almost everywhere in each of my FEM-related function.
How to efficiently assemble a SparseMatrix?
Is it possible to do some OpenMP parallelization for the FEM assemble?
Any helpful suggestions for C++ based FEM coding(library recommendation or any helpful ideas) are welcome!

Thank you. Best regards.

Solution

Yes, passing a SparseMatrix<double> & is the right thing to do here. The purpose of Ref<SparseMatrix> is pass assembled objects that ressemble to a SparseMatrix, like a sub-sparse matrix, a Map<SparseMatrix>...

Using setFromTriplets is also the right thing to do to be sure to get OK performance. Directly inserting the elements using mat.insert(i,j) = val; might be up to x2 faster if you do it properly (i.e., proper call to reserve and proper insertion order). But if you get it wrong, it can also be x100 times slower... See the doc. With SparseMatrix::insert it is also possible to fill the matrix using OpenMP, but this require to be even more careful and strict, here is a typical pattern for this:

int n_cols = ??, n_rows = ??;
std::vector<int> nnz_per_col(n_cols);
// set each nnz_per_col[j] to the exact number
// of non-zero entries in the j-th column (or more, but NOT less)
SparseMatrix<double> mat(n_rows, n_cols);
#pragma omp parallel for
for(int j=0; j<cols; ++j) {
  for each non zero entry i in the j-th column {
    // preferably with increasing i
    double val_i_j = ...;
    mat.insert(i,j) = val_i_j;
  }
}

Of course, you can also work row-wise if that's easier for you. In this case use a SparseMatrix<double,RowMajor>. And of course, you can adjust this pattern to work on blocks of columns/rows, etc.

If for the assembly you need to work with some dense matrix/vectors, then I believe they are pretty small with fixed-size. Then instead of using MatrixXd/VectorXd, better use Matrix<double,N,M> and Matrix<double,N,1> types that are statically allocated. This will prevent numerous memory allocation/deallocation.

Finally, the most important recommendation: if you care about performance, don't forget to profile your code before investigating time and effort if optimizing your code. Also, always bench/profile with compiler optimization ON.