class T {
    size_t *pData; // Memory allocated in the constructor
    friend T operator+(const T& a, const T& b);
};

T operator+(const T& a, const T& b){ // Op 1
    T c; // malloc()
    *c.pData = *a.pData + *b.pData;
    return c;
}

T do_something(const T& /* input */){
    /* Implementation details */
    return T_Obj;
}
This is a simple class T with dynamic memory. Consider:
T a,b,c;
c = a + b; // Case 1
c = a + do_something(b); // Case 2
c = do_something(a) + b; // Case 3
c = do_something(a) + do_something(b); // Case 4
We can do better by additionally defining:
T& operator+(const T& a, T&& b){ // Op 2
    // no malloc(): stealing data from the rvalue b
    *b.pData = *a.pData + *b.pData;
    return b;
}
Case 2 now uses only one malloc(), but what about Case 3? Do we need to define Op 3?
T& operator+(T&& a, const T& b){ // Op 3
    // no malloc(): stealing data from the rvalue a
    *a.pData = *a.pData + *b.pData;
    return a;
}
Further, if we define both Op 2 and Op 3, then because an rvalue can bind to either an rvalue reference or a const lvalue reference, the compiler has two equally good candidates for Case 4:
T& operator+(const T& a, T&& b); // Op 2: the rvalue first argument binds to const T& a
T& operator+(T&& a, const T& b); // Op 3: the rvalue second argument binds to const T& b
The compiler would reject the call as ambiguous. Would defining Op 4 resolve the ambiguity, even though Op 4 gains us no additional performance?
T& operator+(T&& a, T&& b){ // Op 4
    // no malloc(): can steal data from either rvalue, a or b
    *b.pData = *a.pData + *b.pData;
    return b;
}
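The overload-resolution question above can be checked at compile time. The following is only a sketch: `Val`, `three_ops`, `four_ops`, `add` and `three_ops_callable` are hypothetical names introduced for illustration, and each overload just returns the number of the Op it stands for instead of doing any arithmetic.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Val {};  // hypothetical stand-in for T

namespace three_ops {  // Op 1 to Op 3 only
inline int add(const Val&, const Val&) { return 1; }
inline int add(const Val&, Val&&)      { return 2; }
inline int add(Val&&, const Val&)      { return 3; }
}  // namespace three_ops

namespace four_ops {  // Op 1 to Op 4
inline int add(const Val&, const Val&) { return 1; }
inline int add(const Val&, Val&&)      { return 2; }
inline int add(Val&&, const Val&)      { return 3; }
inline int add(Val&&, Val&&)           { return 4; }
}  // namespace four_ops

// True only if three_ops::add(A, B) has exactly one best overload.
// An ambiguous call is a substitution failure here, not a hard error.
template <class A, class B, class = void>
struct three_ops_callable : std::false_type {};

template <class A, class B>
struct three_ops_callable<
    A, B, decltype(void(three_ops::add(std::declval<A>(), std::declval<B>())))>
    : std::true_type {};

static_assert(three_ops_callable<Val&, Val&>::value, "Case 1 resolves to Op 1");
static_assert(three_ops_callable<Val&, Val>::value, "Case 2 resolves to Op 2");
static_assert(three_ops_callable<Val, Val&>::value, "Case 3 resolves to Op 3");
static_assert(!three_ops_callable<Val, Val>::value,
              "Case 4 is ambiguous without Op 4");

// With Op 4 present, two rvalues select it unambiguously.
inline int overload_for_two_rvalues() { return four_ops::add(Val{}, Val{}); }
```

So, under these assumptions, adding the fourth overload does remove the ambiguity, because binding both rvalues to `Val&&` is a better match in each argument than any of the other three candidates.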
With Op 1, Op 2, Op 3 and Op 4, we have
If my understanding is correct, we will need four function signatures per operator. This doesn't seem right, as it is a lot of boilerplate and code duplication per operator. Am I missing something? Is there an elegant way of achieving the same?
This is performant and elegant but makes use of a macro.
#include <cstddef>
#include <iostream>
#include <type_traits>
#include <utility>
#define OPERATOR_Fn(Op) \
template<typename T1, typename T2> \
friend auto operator Op (T1&& a, T2&& b) \
-> typename std::enable_if<std::is_same<std::decay_t<T1>,std::decay_t<T2>>::value,std::decay_t<T1>>::type \
{ \
constexpr bool a_or_b = !std::is_reference<T1>::value; \
std::decay_t<T1> c((a_or_b? std::forward<T1>(a) : std::forward<T2>(b))); \
*c.pData = *c.pData Op (!a_or_b? *a.pData : *b.pData); \
return c; \
}
struct T {
    T(): pData(new size_t(1)) {std::cout << "malloc" << '\n';}
    ~T() {delete pData;}
    T(const T& b): pData(new size_t(1)) { *pData = *b.pData; std::cout << "malloc" << '\n';}
    T(T&& b){
        pData = b.pData;
        b.pData = nullptr; // leave the moved-from object safely destructible
        std::cout<< "move constructing" << '\n';
    }
    OPERATOR_Fn(+) // expands to the friend operator+ template inside T
    size_t *pData; // Memory allocated in the constructor
};
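To see how many allocations each of the four cases costs with this single forwarding template, here is a minimal sketch written without the macro. `Num` and its static `allocs` counter are hypothetical names added for illustration; the operator body follows the same idea as the macro above, and the counts assume non-const operands.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>

struct Num {  // hypothetical stand-in for T
    static int allocs;  // counts every heap allocation made by Num
    std::size_t* pData;

    explicit Num(std::size_t v = 1) : pData(new std::size_t(v)) { ++allocs; }
    Num(const Num& o) : pData(new std::size_t(*o.pData)) { ++allocs; }
    Num(Num&& o) noexcept : pData(o.pData) { o.pData = nullptr; }
    ~Num() { delete pData; }

    template <typename T1, typename T2>
    friend auto operator+(T1&& a, T2&& b) ->
        typename std::enable_if<
            std::is_same<std::decay_t<T1>, std::decay_t<T2>>::value,
            std::decay_t<T1>>::type {
        // T1 is a non-reference type exactly when a was passed as an rvalue.
        constexpr bool steal_a = !std::is_reference<T1>::value;
        // Move from a if it is an rvalue, otherwise move or copy from b.
        std::decay_t<T1> c(steal_a ? std::forward<T1>(a) : std::forward<T2>(b));
        // Add the operand that was not used to build c.
        *c.pData = *c.pData + (steal_a ? *b.pData : *a.pData);
        return c;
    }
};
int Num::allocs = 0;

// Runs the four lvalue/rvalue combinations and returns how many allocations
// the four additions caused in total (operand construction is not counted).
inline int allocations_for_four_cases() {
    Num a(2), b(3), t1(10), t2(20), t3(30), t4(40);
    const int before = Num::allocs;
    Num c1 = a + b;                        // lvalue + lvalue: copies b
    Num c2 = a + std::move(t1);            // lvalue + rvalue: steals t1
    Num c3 = std::move(t2) + b;            // rvalue + lvalue: steals t2
    Num c4 = std::move(t3) + std::move(t4);  // rvalue + rvalue: steals t3
    (void)c1; (void)c2; (void)c3; (void)c4;
    return Num::allocs - before;
}
```

In this sketch only the lvalue + lvalue case should allocate. Note that when the left operand is an lvalue the result is accumulated into the right operand, so the trick is only safe as written for commutative operators such as `+`.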
You can simplify the type_traits expression and make the code more readable by defining something like this:
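template <typename T1, typename T2>
struct enable_if_same_on_decay
    : std::enable_if<std::is_same<std::decay_t<T1>, std::decay_t<T2>>::value, std::decay_t<T1>> {
    // Inheriting from std::enable_if keeps the alias SFINAE-friendly:
    // the nested ::type is simply absent when the decayed types differ.
    static constexpr bool value = std::is_same<std::decay_t<T1>, std::decay_t<T2>>::value;
};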
template <typename T1, typename T2>
using enable_if_same_on_decay_t = typename enable_if_same_on_decay<T1,T2>::type;
The complex type_traits expression
-> typename std::enable_if<std::is_same<std::decay_t<T1>,std::decay_t<T2>>::value,std::decay_t<T1>>::type
simply becomes
-> enable_if_same_on_decay_t<T1,T2>
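Put together, the shortened signature might be used as in the sketch below. `Box` and `can_add` are hypothetical names used only for illustration, and the helper is written here by inheriting from `std::enable_if` so that a mismatched pair of types removes the operator from overload resolution instead of producing a hard error.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

template <typename T1, typename T2>
struct enable_if_same_on_decay
    : std::enable_if<std::is_same<std::decay_t<T1>, std::decay_t<T2>>::value,
                     std::decay_t<T1>> {};

template <typename T1, typename T2>
using enable_if_same_on_decay_t =
    typename enable_if_same_on_decay<T1, T2>::type;

struct Box {  // hypothetical value type
    int v;

    // Participates in overload resolution only when both operands decay
    // to the same type.
    template <typename T1, typename T2>
    friend auto operator+(T1&& a, T2&& b) -> enable_if_same_on_decay_t<T1, T2> {
        return std::decay_t<T1>{a.v + b.v};
    }
};

// Detects whether the expression A + B is well-formed.
template <class A, class B, class = void>
struct can_add : std::false_type {};

template <class A, class B>
struct can_add<A, B, decltype(void(std::declval<A>() + std::declval<B>()))>
    : std::true_type {};

static_assert(can_add<Box&, Box&>::value, "lvalue + lvalue is accepted");
static_assert(can_add<Box&, Box>::value, "lvalue + rvalue is accepted");
static_assert(can_add<Box, Box>::value, "rvalue + rvalue is accepted");
static_assert(!can_add<Box, int>::value, "mismatched types are rejected");

inline int sum_of(const Box& a, const Box& b) { return (a + b).v; }
```

The static assertions check that every lvalue/rvalue combination of `Box` is accepted, while `Box + int` is quietly rejected.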