c++ eigen

Set Vector of Eigen Matrices to 0


I'm adapting the Hough Transform for a special application. For this I need to store a lot of Eigen matrices in a vector, and they all need to be 0 at the beginning. This is how I initialize them:

typedef Eigen::Matrix<int,30,150> HoughMatrix;
std::vector<HoughMatrix> hough_spaces(num_spaces);

My question now is: what is the fastest way to set all elements of all those matrices to 0? I tried looping over the vector and calling this for each element:

hough_spaces[i].setZero();

But that was rather slow. Is there a faster way? Or a way to directly initialize them to 0? Thank you for your help.


Solution

  • First of all, Eigen::Matrix<int,30,150> is aligned to 16 bytes by default. On 64-bit systems or with C++17 this will most likely work properly, but otherwise a std::vector with the default allocator may hand out storage that is not sufficiently aligned. An easy way to work around any alignment issues is to write

    typedef Eigen::Matrix<int,30,150, Eigen::DontAlign> HoughMatrix;
    

    The idiomatic way to get a vector of zero-initialized matrices is

    std::vector<HoughMatrix> hough_spaces(num_spaces, HoughMatrix::Zero());
    

    However, this will result in a loop of memcpy calls (at least for gcc and clang: https://godbolt.org/z/ULixBm).

    Alternatively, you could create a vector of uninitialized HoughMatrix objects and zero them with std::memset:

    std::vector<HoughMatrix> hough_spaces(num_spaces);
    std::memset(hough_spaces.data(), 0, num_spaces*sizeof(HoughMatrix));
    

    Note that for this to run without Eigen looping through all elements, HoughMatrix must either be declared unaligned (as shown at the beginning) or alignment assertions must be disabled: https://godbolt.org/z/nDJqV5
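The memset approach can be tried without Eigen. The following is only a sketch: HoughBlock is a hypothetical stand-in for HoughMatrix (a plain struct of 30*150 ints), and zero_all and count_nonzero are illustrative helper names that do not come from Eigen or from the answer above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical stand-in for HoughMatrix (assumption): a trivially copyable
// block of 30*150 ints, so that the sketch compiles without Eigen.
struct HoughBlock {
    int data[30 * 150];
};

// memset is only well-defined on trivially copyable types.
static_assert(std::is_trivially_copyable<HoughBlock>::value,
              "memset requires a trivially copyable element type");

// Count the non-zero entries; 0 means the whole buffer is cleared.
inline std::size_t count_nonzero(const std::vector<HoughBlock>& spaces) {
    std::size_t n = 0;
    for (const HoughBlock& block : spaces)
        for (int value : block.data)
            if (value != 0) ++n;
    return n;
}

// Zero every byte of every element with a single memset call.
inline void zero_all(std::vector<HoughBlock>& spaces) {
    if (spaces.empty()) return;
    std::memset(spaces.data(), 0, spaces.size() * sizeof(HoughBlock));
    assert(count_nonzero(spaces) == 0);  // debug-only sanity check
}
```

The same zero_all call can be reused to reset the accumulators between runs, since it does not reallocate anything.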

    If you don't actually need the std::vector functionality (mostly the ability to copy and resize), you could just allocate the memory with calloc and free it after use. To be leak-safe, this can be wrapped in a std::unique_ptr:

    // unique_ptr with custom deallocator (use a typedef, if you need this more often):
    std::unique_ptr<HoughMatrix[], void(&)(void*)> hough_spaces(static_cast<HoughMatrix*>(std::calloc(num_spaces, sizeof(HoughMatrix))), std::free);
    if(!hough_spaces) throw std::bad_alloc(); // Useful, if you actually handle bad-allocs. If you ignore failed callocs, you'll likely segfault when accessing the data.
    

    Clang and gcc will optimize this into a single calloc/free pair: https://godbolt.org/z/m4rzRq
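If the calloc-backed unique_ptr is needed in several places, the typedef suggested in the code comment could look roughly like the sketch below. FreeDeleter, ZeroedArray and make_zeroed are hypothetical names introduced here for illustration. The sketch assumes that all-zero bytes are a valid value of T and that T needs no destructor call, because the deleter only calls std::free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical deleter: releases calloc'ed memory with std::free.
// No destructors are run, so T must not need one (assumption).
struct FreeDeleter {
    void operator()(void* p) const noexcept { std::free(p); }
};

// Hypothetical alias: owning pointer to a calloc'ed array of T.
template <class T>
using ZeroedArray = std::unique_ptr<T[], FreeDeleter>;

// Hypothetical helper: allocate n zero-filled objects of type T.
template <class T>
ZeroedArray<T> make_zeroed(std::size_t n) {
    // calloc only guarantees alignment suitable for std::max_align_t.
    static_assert(alignof(T) <= alignof(std::max_align_t),
                  "T is over-aligned for calloc");
    T* p = static_cast<T*>(std::calloc(n, sizeof(T)));
    // calloc(0, ...) may legitimately return a null pointer.
    if (!p && n != 0) throw std::bad_alloc();
    assert(reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0);
    return ZeroedArray<T>(p);
}
```

With this in place, the allocation from the answer would read `auto hough_spaces = make_zeroed<HoughMatrix>(num_spaces);`. Using a stateless functor instead of a function reference as the deleter typically keeps the unique_ptr the same size as a raw pointer.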


    A totally different approach would be to use a 3D Tensor (from Eigen's unsupported Tensor module, header <unsupported/Eigen/CXX11/Tensor>) instead of a vector of matrices:

    typedef Eigen::Tensor<int, 3> HoughSpace;
    HoughSpace hough_spaces(num_spaces,30,150);
    hough_spaces.setZero();
    

    Judging by the generated assembly, though, this looks only semi-optimal, even with -O3 -DNDEBUG.


    Overall, note that benchmarking anything memory-related may be misleading. For example, the call to calloc may return nearly instantaneously but, at a lower level, point to pages that have not been allocated yet, which makes the first access to them more expensive.
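One way to see this effect is to time the calloc call separately from the first and second pass over the memory. The sketch below uses only the standard library; CallocTiming, touch_all and time_calloc are hypothetical names, and the measured times depend entirely on the operating system, allocator and compiler flags, so no particular result should be expected.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical result record for one measurement.
struct CallocTiming {
    double alloc_us;         // time spent inside calloc
    double first_touch_us;   // first pass over the memory
    double second_touch_us;  // second pass over the same memory
    long long first_sum;     // sum of values seen in the first pass
    long long second_sum;    // sum of values seen in the second pass
};

// Add 1 to every element through a volatile pointer, so the accesses are
// not optimized away, and return the sum of the values seen beforehand.
inline long long touch_all(int* data, std::size_t n) {
    volatile int* p = data;
    long long sum_before = 0;
    for (std::size_t i = 0; i < n; ++i) {
        int value = p[i];
        sum_before += value;
        p[i] = value + 1;
    }
    return sum_before;
}

// Time calloc, then two passes over the freshly allocated memory.
inline CallocTiming time_calloc(std::size_t num_ints) {
    using clock = std::chrono::steady_clock;
    using us = std::chrono::duration<double, std::micro>;

    auto t0 = clock::now();
    int* p = static_cast<int*>(std::calloc(num_ints, sizeof(int)));
    auto t1 = clock::now();
    if (!p) throw std::bad_alloc();

    long long first = touch_all(p, num_ints);   // may trigger page faults
    auto t2 = clock::now();
    long long second = touch_all(p, num_ints);  // pages are now mapped
    auto t3 = clock::now();
    std::free(p);

    assert(first == 0);  // calloc'ed memory starts out zeroed
    return CallocTiming{us(t1 - t0).count(), us(t2 - t1).count(),
                        us(t3 - t2).count(), first, second};
}
```

Calling, for example, `time_calloc(num_spaces * 30 * 150)` and comparing alloc_us with first_touch_us gives a rough idea of how much of the cost is deferred to the first access on a given machine.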