Search code examples
c++c++14stdasyncpackaged-taskstd-future

Why std::future is different returned from std::packaged_task and std::async?


I got to know the reason that future returned from std::async has some special shared state through which wait on returned future happened in the destructor of future. But when we use std::pakaged_task, its future does not exhibit the same behavior. To complete a packaged task, you have to explicitly call get() on future object from packaged_task.

Now my questions are:

  1. What could be the internal implementation of future (thinking std::async vs std::packaged_task)?
  2. Why the same behavior was not applied to future returned from std::packaged_task? Or, in other words, how is the same behavior stopped for std::packaged_task future?

To see the context, please see the code below:

It does not wait to finish countdown task. However, if I un-comment // int value = ret.get();, it would finish countdown and is obvious because we are literally blocking on returned future.

    // packaged_task example
#include <iostream>     // std::cout
#include <future>       // std::packaged_task, std::future
#include <chrono>       // std::chrono::seconds
#include <thread>       // std::thread, std::this_thread::sleep_for

// count down taking a second for each value:
int countdown (int from, int to) {
  for (int i=from; i!=to; --i) {
    std::cout << i << std::endl;
    std::this_thread::sleep_for(std::chrono::seconds(1));
  }
  std::cout << "Lift off!" <<std::endl;
  return from-to;
}

int main ()
{
   std::cout << "Start " << std::endl;
  std::packaged_task<int(int,int)> tsk (countdown);   // set up packaged_task
  std::future<int> ret = tsk.get_future();            // get future

  std::thread th (std::move(tsk),10,0);   // spawn thread to count down from 10 to 0

//   int value = ret.get();                  // wait for the task to finish and get result

  std::cout << "The countdown lasted for " << std::endl;//<< value << " seconds.\n";

  th.detach();   

  return 0;
}

If I use std::async to execute task countdown on another thread, no matter if I use get() on returned future object or not, it will always finish the task.

// packaged_task example
#include <iostream>     // std::cout
#include <future>       // std::packaged_task, std::future
#include <chrono>       // std::chrono::seconds
#include <thread>       // std::thread, std::this_thread::sleep_for

    // count down taking a second for each value:
    int countdown (int from, int to) {
      for (int i=from; i!=to; --i) {
        std::cout << i << std::endl;
        std::this_thread::sleep_for(std::chrono::seconds(1));
      }
      std::cout << "Lift off!" <<std::endl;
      return from-to;
    }
    
    int main ()
    {
       std::cout << "Start " << std::endl;
      std::packaged_task<int(int,int)> tsk (countdown);   // set up packaged_task
      std::future<int> ret = tsk.get_future();            // get future
    
      auto fut = std::async(std::move(tsk), 10, 0);   

    
    //   int value = fut.get();                  // wait for the task to finish and get result
    
      std::cout << "The countdown lasted for " << std::endl;//<< value << " seconds.\n";

      return 0;
    }

Solution

  • std::async has definite knowledge of how and where the task it is given is executed. That is its job: to execute the task. To do that, it has to actually put it somewhere. That somewhere could be a thread pool, a newly created thread, or in a place to be executed by whomever destroys the future.

    Because async knows how the function will be executed, it has 100% of the information it needs to build a mechanism that can communicate when that potentially asynchronous execution has concluded, as well as to ensure that if you destroy the future, then whatever mechanism that's going to execute that function will eventually get around to actually executing it. After all, it knows what that mechanism is.

    But packaged_task doesn't. All packaged_task does is store a callable object which can be called with the given arguments, create a promise with the type of the function's return value, and provide a means to both get a future and to execute the function that generates the value.

    When and where the task actually gets executed is none of packaged_task's business. Without that knowledge, the synchronization needed to make future's destructor synchronize with the task simply can't be built.

    Let's say you want to execute the task on a freshly-created thread. OK, so to synchronize its execution with the future's destruction, you'd need a mutex which the destructor will block on until the task thread finishes.

    But what if you want to execute the task in the same thread as the caller of the future's destructor? Well, then you can't use a mutex to synchronize that since it all on the same thread. Instead, you need to make the destructor invoke the task. That's a completely different mechanism, and it is contingent on how you plan to execute.

    Because packaged_task doesn't know how you intend to execute it, it cannot do any of that.

    Note that this is not unique to packaged_task. All futures created from a user-created promise object will not have the special property of async's futures.

    So the question really ought to be why async works this way, not why everyone else doesn't.

    If you want to know that, it's because of two competing needs: async needed to be a high-level, brain-dead simple way to get asynchronous execution (for which sychronization-on-destruction makes sense), and nobody wanted to create a new future type that was identical to the existing one save for the behavior of its destructor. So they decided to overload how future works, complicating its implementation and usage.