c++11 return-value-optimization stdasync

How to properly return large data from a std::future in C++11

I'm a bit puzzled about the proper way to return large data from an async function in C++.

Take for example this code. It creates a large vector in a function and returns the allocated vector.

#include <unistd.h>

#include <iostream>
#include <chrono>

#include <future>
#include <vector>

using timepoint = std::chrono::time_point<std::chrono::system_clock>;

timepoint start_return;

std::vector< int > test_return_of_large_vector(void)
{
    std::vector< int > ret(100000000);
    start_return = std::chrono::system_clock::now();
    return ret;                          // MOVE1
}


int main(void)
{

    timepoint start = std::chrono::system_clock::now();
    auto ret = test_return_of_large_vector();
    timepoint end = std::chrono::system_clock::now();

    auto dur_create_and_return = std::chrono::duration_cast< std::chrono::milliseconds >(end - start);
    auto dur_return_only = std::chrono::duration_cast< std::chrono::milliseconds >(end - start_return);

    std::cout << "create & return time : " << dur_create_and_return.count() << "ms\n";
    std::cout << "return time : " << dur_return_only.count() << "ms\n";

    auto future = std::async(std::launch::async, test_return_of_large_vector);
    sleep(3); // wait long enough for the future to finish its work

    start = std::chrono::system_clock::now();
    ret = future.get();                  // MOVE2
    end = std::chrono::system_clock::now();

    // mind that the roles of start and start_return have changed
    dur_return_only = std::chrono::duration_cast< std::chrono::milliseconds >(end - start);
    dur_create_and_return = std::chrono::duration_cast< std::chrono::milliseconds >(end - start_return);

    std::cout << "duration since future finished: " << dur_create_and_return.count() << "ms\n";
    std::cout << "return time from future: " << dur_return_only.count() << "ms\n";

    return 0;
}

For me this prints

create & return time : 543ms
return time : 0ms
duration since future finished: 2506ms
return time from future: 14ms
//                      ^^^^^^

So apparently, when the function is called in the main thread, the return value is elided or moved. The return value from the future, however, appears to be copied. Adding std::move on the lines marked MOVE1 and MOVE2 does not change the return time of the future.get() call. On the other hand, when returning a pointer, the return time from future.get() is negligible (0ms for me).
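For reference, a minimal sketch of what such a pointer-returning variant could look like. The pointer code is not shown above, so the names make_large_vector_ptr and size_via_pointer, and the choice of std::unique_ptr for ownership, are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <future>
#include <memory>
#include <vector>

// Hypothetical variant of test_return_of_large_vector: the vector lives on
// the heap and only the owning pointer travels through the future.
// C++11 has no std::make_unique, hence the explicit new.
std::unique_ptr<std::vector<int>> make_large_vector_ptr(std::size_t n)
{
    return std::unique_ptr<std::vector<int>>(new std::vector<int>(n));
}

// Runs the producer on another thread and fetches the pointer.
std::size_t size_via_pointer(std::size_t n)
{
    auto future = std::async(std::launch::async, make_large_vector_ptr, n);

    // get() hands over the unique_ptr; the vector itself is not touched.
    std::unique_ptr<std::vector<int>> ptr = future.get();
    return ptr->size();
}
```

Calling size_via_pointer(100000000) would mirror the vector size used in the code above.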

So must large data be returned from futures via a pointer?

Solution

The problem is that you are assigning to ret, which already holds the result of your first call to test_return_of_large_vector. At a minimum, your code then has to free 100000000 * sizeof(int) bytes. The move assignment vector::operator=(vector&&) is specified to be of constant complexity (for appropriate allocators), but releasing the buffer that ret previously owned still takes time, whether that happens inside the assignment itself or in the destructor of the move source.

If you call ret.clear(); ret.shrink_to_fit(); first, then "return time from future" comes down to 0ms, because the old buffer is released before the timed section starts.
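A minimal sketch of that first option, assuming a helper named make_large_vector that stands in for test_return_of_large_vector and a hypothetical wrapper replace_with_future_result. Note that shrink_to_fit is a non-binding request, so whether the buffer is actually released depends on the standard library implementation.

```cpp
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for test_return_of_large_vector.
std::vector<int> make_large_vector(std::size_t n)
{
    return std::vector<int>(n);
}

// Drops the storage ret currently owns, then move-assigns the future's result.
void replace_with_future_result(std::vector<int>& ret,
                                std::future<std::vector<int>>& future)
{
    ret.clear();
    ret.shrink_to_fit();   // non-binding request to give the capacity back

    // With the old buffer already gone, the move assignment has little
    // left to free.
    ret = future.get();
}
```

In the question's main, the equivalent would be to place the two calls on ret before the line that starts the timer for future.get().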

Alternatively you could just move-construct a different variable:

auto ret2 = future.get();                  // MOVE2
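To check this in isolation, something along these lines could be used. It is only a sketch: make_large_vector and time_get_into_new_variable are hypothetical names, and std::chrono::steady_clock is swapped in for system_clock because it is monotonic and better suited to measuring intervals.

```cpp
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical stand-in for test_return_of_large_vector.
std::vector<int> make_large_vector(std::size_t n)
{
    return std::vector<int>(n);
}

// Fetches the future's result into a fresh variable and reports how long
// the get() call itself took. size_out receives the size of the result.
std::chrono::milliseconds time_get_into_new_variable(std::size_t n,
                                                     std::size_t& size_out)
{
    auto future = std::async(std::launch::async, make_large_vector, n);
    future.wait();   // make sure the worker is done before timing get()

    auto start = std::chrono::steady_clock::now();
    auto ret2 = future.get();   // fresh variable: no old buffer to release
    auto end = std::chrono::steady_clock::now();

    size_out = ret2.size();
    return std::chrono::duration_cast<std::chrono::milliseconds>(end - start);
}
```

Using future.wait() instead of a fixed sleep avoids guessing how long the worker needs. The destructor of ret2 runs only when the function returns, so it falls outside the measured interval.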