Templated covariance function in Eigen

One of my projects requires a templated covariance function, i.e. taking a MatrixXd, ArrayXXf and/or a .block() thereof as input and returning an expression to be used in further calculations.

q1. As verification, my attempt below seems to work, but does it actually return an Eigen expression? (edit: @Darhuuk confirmed it does not; fortunately, C++14 or higher is not a problem!)

q2. Obviously, I'd need something of an 'internal Matrix' rather than the internal RowVector I came across in the Eigen documentation about templated functions. Hence my question, how should one create an internal MatrixType z = x.rowwise() - x.colwise().mean(), so the function can return (z.transpose()*z)/(x.rows()-1)? The internal rowvector is used in the covariance examples in the Eigen manual (https://eigen.tuxfamily.org/dox/TopicFunctionTakingEigenTypes.html)

q3. Finally, when adding the templated function to a class with a header file, how do I find which Eigen types need to be used to explicitly instantiate the template for e.g. MatrixXd, a .block() of an ArrayXXd, or a Map? All I could find were examples using simple data types (e.g. Storing C++ template function definitions in a .CPP file)

template <typename Derived>
Matrix<typename Derived::Scalar, Derived::ColsAtCompileTime, Derived::ColsAtCompileTime>  
SampleCov( const DenseBase<Derived>& x )
{

  typedef typename internal::plain_row_type<Derived>::type RowVectorType;
  const RowVectorType x_mean = x.colwise().mean();

  return ((x.rowwise() - x_mean).matrix().transpose() * (x.rowwise() - x_mean).matrix()) / (x.rows()-1);
}

Solution

Your function explicitly returns a Matrix object, so no, it does not return an expression object. One way to fix this is by letting the compiler deduce the return type for you (since the expression object will be some gigantic templated mess):

template <typename Derived>
auto SampleCov (DenseBase<Derived> const & x) {
  auto const x_mean = x.colwise().mean();
  return ((x.rowwise() - x_mean).matrix().transpose()
    * (x.rowwise() - x_mean).matrix()) / (x.rows() - 1);
}

This is assuming you're using C++14 or higher. For C++11 using just auto as the return type doesn't cut it, you need a trailing return type:

template <typename Derived>
auto SampleCov (DenseBase<Derived> const & x)
    -> decltype(((x.rowwise() - x.colwise().mean()).matrix().transpose()
      * (x.rowwise() - x.colwise().mean()).matrix()) / (x.rows() - 1)) {
  auto const x_mean = x.colwise().mean();
  return ((x.rowwise() - x_mean).matrix().transpose()
    * (x.rowwise() - x_mean).matrix()) / (x.rows() - 1);
}

Note that I removed the RowVectorType typedef, since I didn't see the point.

Regarding question 2, I think the above examples solve that issue, since there's no more explicitly named types. The auto for both the return type and inside the function takes care of that.

Finally, question 3. It depends on what exactly you want to do. Based on what you're saying, it seems the above functions don't work for MatrixXd objects or objects returned by calling MatrixXd::block().

To me that doesn't really indicate that you need explicit template instantiation, which is what you're asking.

Instead, you could simply make SampleCov's argument type more generic:

template <typename T>
auto SampleCovGeneric (T const & x) {
  auto const x_mean = x.colwise().mean();
  return ((x.rowwise() - x_mean).matrix().transpose()
    * (x.rowwise() - x_mean).matrix()) / (x.rows() - 1);
}

As long as you can call colwise(), matrix(), ... on an object of type T, you're good to go. This is probably a better way to go anyway, since now it's possible to pass in Eigen expressions to SampleCovGeneric. For example:

Eigen::MatrixXd a(2,2);

// aTranspose is NOT a matrix object!
// Its type is Eigen::Transpose<Eigen::Matrix<double, Dynamic, Dynamic>>.
auto aTranspose = a.transpose();

/* Note use of auto and eval() calls!
 * See https://eigen.tuxfamily.org/dox/TopicPitfalls.html.
 */
auto b = SampleCov(aTranspose.eval()).eval();
auto c = SampleCovGeneric(aTranspose).eval();

In the above example, SampleCov expects an object of type DenseBase<Derived>. But aTranspose is not such a type. So we have to explicitly evaluate (i.e. actually calculate) the transpose first. Depending on what happens inside SampleCov that's a useless calculation. E.g. imagine the first thing calculated is the argument (x) its transpose. In this case that would simply be the original a again. But if the transpose is already evaluated then Eigen can't "see" that and needs to calculate the transpose's transpose.

SampleCovGeneric on the other hand accepts any type, so there's no need to evaluate aTranspose first. That (might) allow Eigen to "see through" a bunch of calculations and thus calculate the result in a more optimized manner. See e.g. https://eigen.tuxfamily.org/dox/TopicLazyEvaluation.html.

If you really need to, you can explicitly instantiate templates so they can be split in header & source files. That would go something like this:

/* Header */
#pragma once

#include <utility>

// Reduce trailing return type mess a little
template <class T>
using SampleCovResultType = decltype(((std::declval<T const &>().rowwise()
      - std::declval<T const &>().colwise().mean()).matrix().transpose()
    * (std::declval<T const &>().rowwise()
      - std::declval<T const &>().colwise().mean()).matrix())
  / (std::declval<T const &>().rows() - 1));

template <typename T>
auto SampleCov (T const & x) -> SampleCovResultType<T>;

// Explicit template declaration
extern template
auto SampleCov<Eigen::MatrixXd> (Eigen::MatrixXd const & x)
    -> SampleCovResultType<Eigen::MatrixXd>;

/* Source */
#include "SampleCov.h"

template <class T>
auto SampleCov (T const & x) -> SampleCovResultType<T> {
  auto const x_mean = x.colwise().mean();
  return ((x.rowwise() - x_mean).matrix().transpose()
    * (x.rowwise() - x_mean).matrix()) / (x.rows() - 1);
}

// Explicit template definition
template
auto SampleCov<Eigen::MatrixXd> (Eigen::MatrixXd const & x)
    -> SampleCovResultType<Eigen::MatrixXd>;

I don't think you can get away with using only auto as the return type. Since the function definition is put into a separate source file, the compiler can't see the function body, so it has no way to deduce the return type when you call the function. So you have to use a trailing return type.

Note the following though:

An explicit instantiation declaration (an extern template) prevents implicit instantiations: the code that would otherwise cause an implicit instantiation has to use the explicit instantiation definition provided somewhere else in the program.

I.e. if you try to call SampleCov with a type T for which you did not explicitly instantiate SampleCov, compilation will fail.

Note: All of the above code is untested. Chance of typos is high :).

Edits:

Fixing missing template argument.
Removed remark about calling matrix(). I wondered whether it evaluated the expression, which you don't want it to do. AFAIK, it does not, based on e.g. ArrayBase's documentation.
Added info about explicit instantiation.