Eigen::SelfAdjointView::rankUpdate slower than A += w*w.transpose()

I measured the speed of Eigen::SelfAdjointView::rankUpdate on an Eigen::Matrix4d and compared it with the naive A += w*w.transpose(). rankUpdate was about 2 times slower.

What am I doing wrong?
Can I speed up this computation?

Solution

For small fixed-size expressions you can't save anything with SelfAdjointView::rankUpdate. It rather adds overhead, because it has to make sure that only the elements of one triangular half are modified. In your case a simple

A.noalias() += w*w.adjoint();

should give near optimal code (adding the .noalias() avoids a copy into a temporary).