Eigen has amazingly fast GEMM implementation so i want to use it inside my pet project tensor library.
If i understand correctly it's possible via Eigen::Map. I wrote simple example and defined EIGEN_NO_MALLOC
to make sure that there are no unwanted allocations.
It's works well with simple matrix multiplication like C += A * B
.
But unfortunately it cannot handle C += alpha * A * B
(GEMM like) situation.
#include <iostream>
#include <vector>
#define EIGEN_NO_MALLOC
#include "Eigen/Core"
int main()
{
using Scalar = float;
using namespace Eigen;
std::vector<Scalar> aDat = {1, 2, 3, 4};
std::vector<Scalar> bDat = {1, 2, 3, 4};
std::vector<Scalar> cDat = {1, 2, 3, 4};
Map<Matrix<Scalar, -1, -1, RowMajor>, Unaligned> a(aDat.data(), 2, 2);
Map<Matrix<Scalar, -1, -1, RowMajor>, Unaligned> b(bDat.data(), 2, 2);
Map<Matrix<Scalar, -1, -1, RowMajor>, Unaligned> c(cDat.data(), 2, 2);
//Ok
c.noalias() += a * b;
//Assertion `false && "heap allocation is forbidden.....
c.noalias() += 2 * a * b;
return 0;
}
c.noalias() += 2 * a * b;
gives me following runtime error
a.out: path_to_eigen/Eigen/src/Core/util/Memory.h:129: void Eigen::internal::check_that_malloc_is_allowed(): Assertion `false && "heap allocation is forbidden (EIGEN_NO_MALLOC is defined)"' failed.
Is it possible to call c.noalias() += someScalar * a * b;
without allocations?
PS My eigen version is 3.3.2
, gcc
version is 7.2.0
Sorry for my poor English
This looks like a bug introduced by a small-size optimization. For products (KxM)*(MxN)
where K+M+N < EIGEN_GEMM_TO_COEFFBASED_THRESHOLD
(which by default is 20
), Eigen switches to a lazy product implementation, which apparently thinks evaluating into a temporary is a good idea. Your example will actually work fine, e.g. for a 5x5 * 5x10
product.
I filed a bug for that issue: http://eigen.tuxfamily.org/bz/show_bug.cgi?id=1562
If you want a quick work-around, define -D EIGEN_GEMM_TO_COEFFBASED_THRESHOLD=1
, this will always use the big matrix-GEMM implementation.
On the other hand, if you know your matrix size at compile time, you should better use the corresponding fixed-size type (Matrix<Scalar, 2, 2, RowMajor>
in your case).