Tags: optimization, eigen, matrix-multiplication, neon

Does Eigen have a self-transpose multiply optimization like H.transpose()*H?


I have browsed the Eigen tutorial at https://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixArithmetic.html

It says: "Note: for BLAS users worried about performance, expressions such as c.noalias() -= 2 * a.adjoint() * b; are fully optimized and trigger a single gemm-like function call."

But what about a computation like H.transpose() * H? Because its result is a symmetric matrix, it should only need half the time of a general A*B, yet in my test H.transpose() * H takes the same time as H.transpose() * B. Does Eigen have a special optimization for this situation? OpenCV, for example, has a similar function for this.

I know a symmetric optimization can break vectorization; I just want to know whether Eigen has a solution that provides both the symmetric optimization and vectorization.
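
For reference, a minimal sketch of the comparison I am timing (the sizes m and n below are arbitrary placeholders, not from my real code):

    #include <Eigen/Dense>
    #include <chrono>
    #include <iostream>

    int main() {
        const int m = 2000, n = 500;                        // arbitrary placeholder sizes
        Eigen::MatrixXd H = Eigen::MatrixXd::Random(m, n);
        Eigen::MatrixXd B = Eigen::MatrixXd::Random(m, n);
        Eigen::MatrixXd Z1(n, n), Z2(n, n);

        auto t0 = std::chrono::steady_clock::now();
        Z1.noalias() = H.transpose() * H;                   // symmetric result
        auto t1 = std::chrono::steady_clock::now();
        Z2.noalias() = H.transpose() * B;                   // generic product
        auto t2 = std::chrono::steady_clock::now();

        std::cout << "H^T*H: " << std::chrono::duration<double>(t1 - t0).count() << " s\n";
        std::cout << "H^T*B: " << std::chrono::duration<double>(t2 - t1).count() << " s\n";
        return 0;
    }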


Solution

  • You are right; you need to tell Eigen that the result is symmetric, like this:

    Eigen::MatrixXd H = Eigen::MatrixXd::Random(m,n);  // H is m x n
    Eigen::MatrixXd Z = Eigen::MatrixXd::Zero(n,n);    // result is n x n
    Z.template selfadjointView<Eigen::Lower>().rankUpdate(H.transpose());
    

    The last line computes Z += H^T * H within the lower triangular part; the upper part is left unchanged. If you want the full matrix, copy the lower part to the upper one:

    Z.template triangularView<Eigen::Upper>() = Z.transpose();
    

    This rankUpdate routine is fully vectorized and comparable to the BLAS equivalent. For small matrices, it is better to perform the full product.
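
    Putting both steps together, a minimal self-contained sketch (the dimensions m and n are placeholders) could look like this, with the plain product kept as a reference check:

    #include <Eigen/Dense>
    #include <iostream>

    int main() {
        const int m = 1000, n = 200;                        // placeholder dimensions
        Eigen::MatrixXd H = Eigen::MatrixXd::Random(m, n);

        // Accumulate Z += H^T * H, writing only the lower triangular part.
        Eigen::MatrixXd Z = Eigen::MatrixXd::Zero(n, n);
        Z.selfadjointView<Eigen::Lower>().rankUpdate(H.transpose());

        // Mirror the lower triangle into the upper one to get the full matrix.
        Z.triangularView<Eigen::Upper>() = Z.transpose();

        // Plain product for comparison (and the faster choice for small matrices).
        Eigen::MatrixXd Zref(n, n);
        Zref.noalias() = H.transpose() * H;

        std::cout << "max abs difference: " << (Z - Zref).cwiseAbs().maxCoeff() << "\n";
        return 0;
    }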

    See also the respective doc.