c++arrays performance operator-overloading hpc

C++ array operator overhead

I remember reading a while back some code that allowed the compiler to do some work and simplify an expression like this one:

// edit: yes the parameters where meant to be passed by reference
//       and maintain constness sorry !
template< typename T >
std::vector<T> operator+( const std::vector<T>& a, const std::vector<T>& b )
{
    assert( a.size() == b.size() );
    std::vector<T> o; o.reserve( a.size() );
    for( std::vector<T>::size_type i = 0; i < a.size(); ++i )
        o[i] = a[i] + b[i];
    return o;
}

// same for operator* but a[i] * b[i] instead

std::vector<double> a, b, c, d, e;

// do some initialization of the vectors

e = a * b + c * d

Where normally a new vector would be created and allocated for each operator instead the compiler would instead only create one copy and do all the operations onto it.

What is this technique?

Solution

As @Agnew mentioned very early, the technique you're describing is expression templates.

This is typically done with the mathematical concept of a vector¹, not a std::vector. The broad strokes are:

Don't have math operations on your vectors return the result. Instead, have them return a proxy object that represents the operation that eventually needs to be done. a * b could return a "multiplication proxy" object that just holds const references to the two vectors that should be multiplied.
Write math operations for these proxies too, allowing them to be chained together, so a * b + c * d becomes (TempMulProxy) + (TempMulProxy) becomes (TempAddProxy), all without doing any of the math or copying any vectors.
Write an assignment operator that takes your proxy object for the right-side object. This operator can see the entire expression a * b + c * d and do that operation efficiently on your vector, while knowing the destination. All without creating multiple temporary vector objects.

¹ or matrix or quaternion, etc...*