c++ performance matrix eigen eigen3

Eigen rowwise addition/subtraction performance


While profiling my program, I found that the following lines are the bottleneck:

// Some big nested loop
{
    const auto inpRow = inpMap.row(counter);
    outMap.row(adjRow) -= inpRow;
    outMap.row(colInd) += inpRow;
}

outMap and inpMap are Eigen::Map<Eigen::MatrixRX<Scalar>>, where Eigen::MatrixRX is defined as Eigen::Matrix<Scalar, -1, -1, Eigen::RowMajor>, i.e. a row-major matrix.

Is there a way to improve the performance of such operations (other than a parallel for, of course)?


Solution

  • There is not much you can do, as such expressions should already be fully vectorized. Nevertheless, here are some tips:

    • Make sure compiler optimizations are enabled: -O3 -march=native
    • Then measure the time it takes and compute the FLOPS (each element of inpRow accounts for two floating-point operations, one subtraction and one addition) to see how far you are from the theoretical peak performance of your CPU. Disable turbo-boost for that experiment.
    • If you're very far from the theoretical peak, then you're very likely suffering from cache misses. You might reduce them by splitting the two assignments into chunks smaller than 16 kB, so that each chunk of inpRow is more likely to still be in cache when the second assignment reads it. You might get a speed-up of up to 2x from that.