Tags: c++, pointers, reference, pass-by-reference

How to return a reference to a vector declared in the method body?


I have this method:

vector<float> MyObject::getResults(int n = 1000)
{
    vector<float> results(n, 0);
    // do some stuff
    return results;
}

Of course, this is not optimal, and I would like to return a reference to this vector instead, but I cannot simply do this:

const vector<float>& MyObject::getResults(int n = 1000)
{
    vector<float> results(n, 0);
    // do some stuff
    return results;
}

This doesn't work: the vector is a local variable, so it is destroyed at the end of the method.

So the only solution I found is to create a private vector in MyObject and return a reference to it:

const vector<float>& MyObject::getResults(int n = 1000)
{
    this->results.clear();
    this->results.resize(n, 0);
    // do some stuff
    return results;
}

Is this the right way to do it? Is there another solution you would suggest?


Solution

  • What's most efficient?

    Return by value. Don't worry, no copying occurs. This is best practice:

    // Use this
    vector<float> getResults(int n = 1000);
    

    Why is this? Local variables returned from a function are not copied. They are moved into the location where the return value will be stored:

    // Result moved into v; no copying occurs
    vector<float> v = getResults(); 
    
    // Result moved into memory allocated by new; no copying occurs
    vector<float>* q = new vector<float>(getResults()); 
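
One way to check the "no copying" claim is to compare the address of the heap buffer inside the function with the address the caller ends up with. The sketch below assumes a free function named getResults and a hypothetical global, g_bufferInsideFunction, that exists only to record the address; neither is from the original code. On mainstream standard library implementations the two addresses match whether the return is elided or moved.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: records where the local vector's heap buffer lives
// just before the function returns.
const float* g_bufferInsideFunction = nullptr;

std::vector<float> getResults(int n = 1000) {
    std::vector<float> results(n, 0.0f);
    if (!results.empty()) {
        results[0] = 42.0f;                   // stand-in for "do some stuff"
    }
    g_bufferInsideFunction = results.data();  // remember the buffer address
    return results;                           // elided or moved, not copied
}

// Returns true when the caller's vector owns the very same heap buffer
// that was filled inside getResults.
bool callerReusesBuffer() {
    std::vector<float> v = getResults();
    return v.data() == g_bufferInsideFunction
        && v.size() == 1000
        && v[0] == 42.0f;
}
```

If the elements had been copied, the caller's vector would have had to allocate a second buffer at a different address.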
    

    How does this work?

    When a function returns an object, it returns it in one of two ways:

    • In the registers
    • In memory

    As a rule, only small, simple objects such as ints and doubles are returned in registers (the exact rules depend on the platform's calling convention). For values returned in memory, the caller passes the function a pointer to the location where it must place the return value.

    When you call new vector<float>(getResults());, the following things happen:

    • The computer allocates memory for a new vector
    • It gives the location of that memory to getResults(), along with any other parameters.
    • getResults constructs the vector in that memory, no need to copy.
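
The three steps above can be imitated by hand with placement new. This is only an analogy, not what the compiler literally emits: constructResultsAt and its returnSlot parameter are hypothetical names standing in for the hidden pointer that the calling convention passes.

```cpp
#include <cassert>
#include <new>
#include <vector>

using FloatVec = std::vector<float>;

// Hypothetical hand-written analogue of the hidden "return slot" pointer:
// the callee constructs the vector directly in memory the caller provides.
FloatVec* constructResultsAt(void* returnSlot, int n) {
    return new (returnSlot) FloatVec(n, 0.0f);
}

// Mirrors the steps: allocate memory, pass its location, construct in place.
bool builtInPlace() {
    void* memory = ::operator new(sizeof(FloatVec));  // step 1: allocate
    FloatVec* v = constructResultsAt(memory, 8);      // steps 2 and 3
    bool sameAddress = static_cast<void*>(v) == memory;
    bool rightSize = v->size() == 8;
    v->~FloatVec();              // destroy the vector we constructed by hand
    ::operator delete(memory);   // release the raw memory
    return sameAddress && rightSize;
}
```

In real code you would simply write new vector<float>(getResults()) and let the compiler handle the slot; the manual version only makes the data flow visible.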

    What about returning a reference to a member variable?

    Generally speaking, this is a premature optimization that may not provide much, or any, benefit, and it makes your code more complex and more prone to bugs.

    If you assign the output of getResults to a vector, then the data gets copied anyway:

    MyObject m; 
    vector<float> v = m.getResults(); // if getResults returns a const reference, the data gets copied
    

    On the other hand, if you bind the output of getResults to a const reference, the reference is only valid while the MyObject that owns the vector is alive, which makes lifetimes harder to manage. In the example below, the returned reference is invalidated as soon as the function ends, because m is destroyed.

    vector<float> const& evilDoNotUseThisFunction() {
        MyObject m;
        vector<float> const& ref = m.getResults();
        return ref; // This is a bug - ref is invalid when m gets destroyed
    }
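
Both cases can be observed by comparing buffer addresses. The sketch below assumes a minimal, hypothetical MyObject with a private member named results_ (the question does not show the class definition): assigning the returned reference to a vector produces an independent copy, while binding it to a const reference merely aliases the member.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal MyObject; the real class is not shown in the question.
class MyObject {
public:
    const std::vector<float>& getResults(int n = 1000) {
        results_.assign(n, 0.0f);  // reset the member to n zeros
        return results_;           // reference into this object
    }
private:
    std::vector<float> results_;
};

// Returns true when the by-value assignment made its own O(n) copy.
bool copyHasOwnBuffer() {
    MyObject m;
    std::vector<float> copy = m.getResults();        // copy constructor runs
    const std::vector<float>& ref = m.getResults();  // aliases m's member
    return copy.data() != ref.data() && copy.size() == ref.size();
}
```

Because ref points into m, it must not be used after m is destroyed, which is exactly the hazard in the function above.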
    

    What's the difference between copying and moving for std::vector?

    Copying loops over all the elements of a vector. When a vector is copied, all the data stored by the vector gets copied:

    vector<float> a = getVector(); // Get some vector
    
    vector<float> b = a; // Copies a
    

    This is equivalent to the following code:

    vector<float> a = getVector(); // Get some vector
    
    vector<float> b(a.size()); // Allocate vector of size a
    
    // Copy data; this is O(n)
    float* data = b.data();
    for(float f : a) {
        *data = f;
        data++;
    }
    

    Moving doesn't loop over any elements. When a vector is constructed by move, it's as though it's swapped with an empty vector:

    vector<float> a = getVector(); // Get some vector
    
    vector<float> b = std::move(a); // Move a into b
    

    is equivalent to:

    vector<float> a = getVector(); // Get some vector
    
    vector<float> b; // Make empty vector (no memory allocated)
    
    std::swap(a, b); // Swap a with b; very fast; this is O(1)
    

    TL;DR: Copying copies all the data in a loop. Moving just swaps out who owns the memory.
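
The difference is easy to observe through data(), which returns the address of the vector's heap buffer. In this small sketch the function names are illustrative only: a copy ends up with a second buffer holding equal contents, while a move construction hands the original buffer to the new vector and leaves the source empty.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Copy: b gets its own buffer with the same contents; a is unchanged.
bool copyDuplicatesBuffer() {
    std::vector<float> a(1000, 1.5f);
    std::vector<float> b = a;
    return b == a && b.data() != a.data();
}

// Move: b takes over a's buffer; no element is touched.
bool moveTransfersBuffer() {
    std::vector<float> a(1000, 1.5f);
    const float* original = a.data();
    std::vector<float> b = std::move(a);
    // After move construction the source vector is empty.
    return b.data() == original && b.size() == 1000 && a.empty();
}
```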

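    How do we know results gets moved? Since C++11, a local variable named in a return statement is treated as an rvalue, so it is moved automatically whenever the copy is not elided entirely. You don't have to call std::move; writing return std::move(results); can actually prevent the optimization described next.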

    Does a swap actually occur? In many cases, no. A swap is already cheap, but the compiler can be clever and optimize out the swap entirely. It does this by constructing your results vector in the memory where it'll be returning results. This is called Named Return Value Optimization. See https://shaharmike.com/cpp/rvo/#named-return-value-optimization-nrvo
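
To see how a local variable is returned without relying on vector internals, one option is an instrumented type that counts its own copy and move constructions. Tracker and makeTracker below are hypothetical names used only for this illustration. The copy count stays at zero; the move count depends on whether the compiler applies NRVO and on the language standard in use, so the sketch does not assume an exact value for it.

```cpp
#include <cassert>

// Hypothetical instrumented type that counts copy and move constructions.
struct Tracker {
    static int copies;
    static int moves;
    Tracker() = default;
    Tracker(const Tracker&) { ++copies; }
    Tracker(Tracker&&) noexcept { ++moves; }
};
int Tracker::copies = 0;
int Tracker::moves = 0;

// Returns a named local by value, like getResults does.
Tracker makeTracker() {
    Tracker local;
    return local;  // NRVO candidate; otherwise implicitly moved
}

// Resets the counters, returns a local by value, and reports the copy count.
int copiesAfterReturn() {
    Tracker::copies = 0;
    Tracker::moves = 0;
    Tracker t = makeTracker();
    (void)t;
    return Tracker::copies;
}
```

With NRVO applied, Tracker::moves is also zero after the call; without it, the local is moved rather than copied.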