I have read the Eigen tutorial at https://eigen.tuxfamily.org/dox-devel/group__TutorialMatrixArithmetic.html
It says: "Note: for BLAS users worried about performance, expressions such as c.noalias() -= 2 * a.adjoint() * b; are fully optimized and trigger a single gemm-like function call."
But what about a computation like H.transpose() * H? Its result is a symmetric matrix, so it should need only about half the time of a general A*B. In my test, however, H.transpose() * H takes the same time as H.transpose() * B. Does Eigen have a special optimization for this situation, as OpenCV does with a similar function?
I know a symmetric optimization will break the vectorization; I just want to know whether Eigen has a solution that provides both the symmetric optimization and vectorization.
You are right: you need to tell Eigen that the result is symmetric, like this:
Eigen::MatrixXd H = Eigen::MatrixXd::Random(m,n);
Eigen::MatrixXd Z = Eigen::MatrixXd::Zero(n,n);
Z.template selfadjointView<Eigen::Lower>().rankUpdate(H.transpose());
The last line computes Z += H^T * H (rankUpdate(u) adds u * u^T, and here u = H.transpose()) within the lower triangular part only. The upper part is left unchanged. If you want a full matrix, copy the lower part to the upper one:
Z.template triangularView<Eigen::Upper>() = Z.transpose();
This rankUpdate routine is fully vectorized and comparable to the BLAS equivalent. For small matrices, it is better to perform the full product.
See also the documentation for SelfAdjointView::rankUpdate.