Search code examples
c++eigen

Basic ways to speed up a simple Eigen program


I'm looking for the fastest way to do simple operations using Eigen. There are so many datastructures available, its hard to tell which is the fastest.

I've tried to predefine my data structures, but even then my code is being outperformed by similar Fortran code. I've guessed Eigen::Vector3d is the fastest for my needs, (since its predefined), but I could easily be wrong. Using -O3 optimization during compile time gave me a big boost, but I'm still running 4x slower than a Fortran implementation of the same code.

I make use of an 'Atom' structure, which is then stored in an 'atoms' vector defined by the following:

struct Atom {
    std::string element;
    //double x, y, z;
    Eigen::Vector3d coordinate;
};
std::vector<Atom> atoms;

The slowest part of my code is the following:

distance = atoms[i].coordinate - atoms[j].coordinate;
distance_norm = distance.norm();

Is there a faster data structure I could use? Or is there a faster way to perform these basic operations?


Solution

  • As you pointed out in your comment, adding the -fno-math-errno compiler flag gives you a huge increase in speed. As to why that happens, your code snipped shows that you're doing a sqrt via distance_norm = distance.norm();.

    This makes the compiler not set ERRNO after each sqrt (that's a saved write to a thread local variable), which is faster and enables vectorization of any loop that is doing this repeatedly.The only disadvantage to this is that the IEEE adherence is lost. See gcc man.

    Another thing you might want to try is adding -march=native and adding -mfma if -march=native doesn't turn it on for you (I seem to remember that in some cases it wasn't turned on by native and had to be turned on by hand - check here for details). And as always with Eigen, you can disable bounds checking with -DNDEBUG.

    SoA instead of AoS!!! If performance is actually a real problem, consider using a single 4xN matrix to store the positions (and have Atom keep the column index instead of the Eigen::Vector3d). It shouldn't matter too much in the small code snippet you showed, but depending on the rest of your code, may give you another huge increase in performance.