Tags: c++, c++11, parameter-passing, return-value, return-value-optimization

Which "return" method is better for large data in C++/C++11?


This question was triggered by confusion about RVO in C++11.

I have two ways to "return" a value: return by value and return via a reference parameter. Performance aside, I prefer the first one, since returning by value is more natural and makes it easy to distinguish the inputs from the output. But when efficiency matters and the returned data is large, I can't decide, because in C++11 there is RVO.

Here is my example code; the two versions do the same work:

return by value

#include <vector>
using std::vector;

struct SolutionType
{
    vector<double> X;
    vector<double> Y;
    SolutionType(int N) : X(N), Y(N) { }
};

SolutionType firstReturnMethod(const double input1,
                               const double input2)
{
    // Some work is here

    SolutionType tmp_solution(N);
    // The member names are long, so create aliases.
    vector<double> &x = tmp_solution.X;
    vector<double> &y = tmp_solution.Y;

    for (...)
    {
        // some operations on x and y;
        // after that, these two vectors become very large
    }

    return tmp_solution;
}

return via reference parameter

void secondReturnMethod(SolutionType& solution,
                        const double input1,
                        const double input2)
{
    // Some work is here

    // The member names are long, so create aliases.
    vector<double> &x = solution.X;
    vector<double> &y = solution.Y;

    for (...)
    {
        // some operations on x and y;
        // after that, these two vectors become very large
    }
}

Here are my questions:

  1. How can I ensure that RVO happens in C++11?
  2. If we are sure that RVO happens, which "return" method do you recommend in modern C++ programming? Why?
  3. Why do some libraries return via a reference parameter? Is it code style or a historical reason?

UPDATE: Thanks to these answers, I now know the first method is better in most respects.

Here are some useful related links that helped me understand this problem:

  1. How to return large data efficiently in C++11
  2. In C++, is it still bad practice to return a vector from a function?
  3. Want Speed? Pass by Value.

Solution

  • First of all, the proper technical term for what you are doing is NRVO. RVO relates to temporaries being returned:

    X foo() {
       return make_x();
    }
    

    NRVO refers to named objects being returned:

    X foo() {
        X x = make_x();
        x.do_stuff();
        return x;
    }
    

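    Second, (N)RVO is a compiler optimization and is not mandated by C++11. (Since C++17, elision is guaranteed when a temporary is returned, as in the first example above; NRVO remains optional.) However, you can be pretty sure that any modern compiler applies (N)RVO aggressively.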
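
Because elision is optional, one way to see what a given compiler actually does is to count constructor calls. The following is only a sketch: `Probe`, `makeTemporary`, `makeNamed`, and `checkReturnsNeverCopy` are hypothetical names introduced for illustration, not part of the question's code.

```cpp
#include <cassert>

// Hypothetical probe type: counts how often it is copied or moved.
struct Probe {
    static int copies;
    static int moves;
    Probe() = default;
    Probe(const Probe&) { ++copies; }
    Probe(Probe&&) noexcept { ++moves; }
};
int Probe::copies = 0;
int Probe::moves = 0;

// Returns a temporary (the RVO case).
Probe makeTemporary() {
    return Probe();
}

// Returns a named local object (the NRVO case).
Probe makeNamed() {
    Probe p;
    return p;
}

// Runs both factories, then checks that no copy constructor was ever called.
// Moves may or may not appear, depending on the compiler and standard mode.
void checkReturnsNeverCopy() {
    Probe::copies = 0;
    Probe::moves = 0;
    Probe a = makeTemporary();
    Probe b = makeNamed();
    (void)a;
    (void)b;
    assert(Probe::copies == 0);
}
```

After calling `checkReturnsNeverCopy()`, `Probe::moves` shows how many returns were not elided. With elision applied everywhere it stays at zero; building with elision disabled (for example GCC's or Clang's `-fno-elide-constructors`) should make the moves visible.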

    Third, (N)RVO is not a C++11 feature; it existed long before 2011.

    Fourth, what C++11 does give you is the move constructor. If your class supports move semantics, a returned local object is moved, not copied, even when (N)RVO does not happen. Unfortunately, not every type can be moved efficiently.
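
For a type like the question's `SolutionType`, whose members are `std::vector`s, the implicitly generated move constructor hands over the heap buffers instead of copying the elements. A minimal sketch of how to check this follows; `buildSolution` and `bufferWasReused` are hypothetical helpers, and the pointer out-parameter exists only as instrumentation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the question's SolutionType.
struct SolutionType {
    std::vector<double> X;
    std::vector<double> Y;
    explicit SolutionType(std::size_t n) : X(n), Y(n) {}
};

// Hypothetical helper: fills a solution and records where X's buffer lives
// just before returning, so the caller can compare addresses afterwards.
SolutionType buildSolution(std::size_t n, const double*& bufferInsideFunction) {
    assert(n > 0 && "need a non-empty buffer to compare");
    SolutionType tmp(n);
    for (std::size_t i = 0; i < n; ++i) {
        tmp.X[i] = static_cast<double>(i);
        tmp.Y[i] = 2.0 * static_cast<double>(i);
    }
    bufferInsideFunction = tmp.X.data();
    return tmp;  // either elided (NRVO) or moved; the elements are not copied
}

// True when the caller's vector owns the buffer allocated inside the function.
bool bufferWasReused(std::size_t n) {
    const double* inside = nullptr;
    SolutionType s = buildSolution(n, inside);
    return s.X.data() == inside;
}
```

`bufferWasReused(1000000)` is expected to return `true` whether or not the compiler elides the return, because the fallback is a move of the two vectors rather than an element-by-element copy.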

    Fifth, returning via a reference parameter is a terrible antipattern. It ensures that the object is effectively created twice, first as an 'empty' object and then again when it is populated with data, and it precludes you from using types for which the 'empty' state is not a valid invariant.
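
To illustrate the last point, here is a sketch of a type that has no valid 'empty' state. `NonEmptySamples`, `makeSamples`, and `firstOfThree` are hypothetical names chosen for this example only.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical type whose invariant forbids an empty state:
// it always holds at least one value and has no default constructor.
class NonEmptySamples {
public:
    explicit NonEmptySamples(std::vector<double> values)
        : values_(std::move(values)) {
        if (values_.empty()) {
            throw std::invalid_argument("NonEmptySamples needs at least one value");
        }
    }
    double first() const { return values_.front(); }
    std::size_t size() const { return values_.size(); }

private:
    std::vector<double> values_;
};

// Return by value: the object is created once and is valid from the start.
NonEmptySamples makeSamples(std::size_t n, double start) {
    std::vector<double> values;
    values.reserve(n);
    for (std::size_t i = 0; i < n; ++i) {
        values.push_back(start + static_cast<double>(i));
    }
    return NonEmptySamples(std::move(values));
}

// Example caller: the result can be const because it is complete on creation.
double firstOfThree() {
    const NonEmptySamples s = makeSamples(3, 1.5);
    assert(s.size() == 3);
    return s.first();
}
```

An out-parameter version, say `void fillSamples(NonEmptySamples& out, ...)`, would force the caller to construct some placeholder `NonEmptySamples` first and would rule out declaring the result `const`.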