I created a for loop to define a std::vector of std::future to execute my function vector<int> identify, and another loop to get the results by calling std::future::get(), as follows:
for (int i = 0; i < NUM_THREADS; ++i) {
    VecString::const_iterator first = dirList.begin() + i*share;
    VecString::const_iterator last = i == NUM_THREADS - 1 ? dirList.end() : dirList.begin() + (i + 1)*share;
    VecString job(first, last);
    futures[i] = async( launch::async, [=]() -> VecInt {
        return identify(i, job, make_tuple( bIDList, wIDList, pIDList, descriptor), testingDir, binaryMode, logFile, logFile2 );
    } );
}
int correct = 0;
int numImages = 0;
for( int i = 0; i != NUM_THREADS; ++i ) {
    VecInt ret = futures[i].get();
    correct += ret[0];
    numImages += ret[1];
}
The job is to process some images, and I divide the work roughly equally between the threads. I also embed std::cout in the function to indicate which thread the results come from.
I expected that by the time the first thread finishes its work, the others would also have completed theirs, and the loop would simply print the results. However, after the first thread finishes, the other threads are still working. I believe they are really working, not just printing results, because there is a noticeable delay when the function processes a big image. This makes me wonder when the threads really start.
I know from the documentation that each thread starts right away after its initialization but how can you explain my observation? Thank you very much and any help is much appreciated
Because you are using std::launch::async, it's up to std::async to determine how to schedule your requests. According to cppreference.com:
The template function async runs the function f asynchronously (potentially in a separate thread which may be part of a thread pool) and returns a std::future that will eventually hold the result of that function call.
It does guarantee that they will be threaded, however, and you can infer that the evaluation of your lambda will be scheduled to happen at the next available opportunity:
If the async flag is set (i.e. (policy & std::launch::async) != 0), then async executes the callable object f on a new thread of execution (with all thread-locals initialized) as if spawned by std::thread(std::forward<F>(f), std::forward<Args>(args)...), except that if the function f returns a value or throws an exception, it is stored in the shared state accessible through the std::future that async returns to the caller.
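As a small illustration of the "new thread of execution" wording, here is a minimal sketch that compares thread IDs. The function name runs_on_other_thread is made up for this example and is not part of your code or of the standard library:

```cpp
#include <cassert>
#include <future>
#include <thread>

// Hypothetical helper: returns true if a task launched with
// std::launch::async ran on a thread other than the calling one.
bool runs_on_other_thread() {
    const std::thread::id caller = std::this_thread::get_id();

    // The lambda reports the ID of whichever thread executes it.
    std::future<std::thread::id> worker =
        std::async(std::launch::async, [] { return std::this_thread::get_id(); });

    assert(worker.valid());          // the future refers to a shared state
    return worker.get() != caller;   // get() waits for the task, then compares
}
```

Calling runs_on_other_thread() from main should return true, since the calling thread is still alive while the task runs and two live threads never share an ID.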
For the purposes of your question, however, you just wanted to know when it's executed in relation to your call to get. It's easy to demonstrate that get has nothing to do with the execution of async tasks when launched with std::launch::async:
#include <iostream>
#include <future>
#include <thread>
#include <vector>
#include <chrono>
using namespace std;

int main() {
    auto start = chrono::steady_clock::now();
    // Prefixes a line with the time elapsed since start, in microseconds.
    auto timestamp = [start]( ostream & s )->ostream& {
        auto now = chrono::steady_clock::now();
        auto elapsed = chrono::duration_cast<chrono::microseconds>(now - start);
        return s << "[" << elapsed.count() << "us] ";
    };
    vector<future<int>> futures;
    for( int i = 0; i < 5; i++ )
    {
        futures.emplace_back( async(launch::async,
            [=](){
                timestamp(cout) << "Launch " << i << endl;
                return i;
            } ) );
    }
    // Give the tasks time to run before any call to get().
    this_thread::sleep_for( chrono::milliseconds(100) );
    for( auto & f : futures ) timestamp(cout) << "Get " << f.get() << endl;
    return 0;
}
Output (live example here):
[42us] Launch 4
[85us] Launch 3
[95us] Launch 2
[103us] Launch 1
[109us] Launch 0
[100134us] Get 0
[100158us] Get 1
[100162us] Get 2
[100165us] Get 3
[100168us] Get 4
These operations are trivial, but if you have long-running tasks then you can expect that some or all of those tasks might still be executing when you call std::future<T>::get(). In that case, your thread will be suspended until the promise associated with that future is satisfied. Also, because the async tasks may be pooled, it's possible that some will not begin evaluation until after others have completed.
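To make that concrete, here is a rough sketch of the situation you observed: the first get() returns while the other tasks are still busy. The function name pending_after_first_get and the "gate" used to keep the other tasks running are assumptions for illustration only; in your program the image processing itself plays that role.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <future>
#include <vector>

// Hypothetical helper: counts how many tasks are still unfinished right
// after the first get() returns. Task 0 finishes immediately; the others
// stand in for long-running jobs by waiting on a gate that is opened only
// after the count has been taken. Expects count >= 1.
int pending_after_first_get(int count) {
    std::promise<void> gate;
    std::shared_future<void> open = gate.get_future().share();

    std::vector<std::future<int>> futures;
    for (int i = 0; i < count; ++i) {
        futures.push_back(std::async(std::launch::async, [i, open] {
            if (i != 0) open.wait();  // simulate work that outlasts task 0
            return i;
        }));
    }

    int first = futures[0].get();     // blocks only until task 0 is done
    assert(first == 0);

    int pending = 0;
    for (std::size_t i = 1; i < futures.size(); ++i) {
        // A zero timeout polls the shared state without blocking.
        if (futures[i].wait_for(std::chrono::seconds(0)) != std::future_status::ready)
            ++pending;
    }

    gate.set_value();                 // let the remaining tasks finish
    for (std::size_t i = 1; i < futures.size(); ++i)
        futures[i].get();             // each get() blocks until that task is done
    return pending;
}
```

With four tasks, pending_after_first_get(4) reports 3: finishing task 0 says nothing about the others, and each later get() simply waits for its own task.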
If you instead use std::launch::deferred, then you will get lazy evaluation on the calling thread, and so the output would be something like:
[100175us] Launch 0
[100323us] Get 0
[100340us] Launch 1
[100352us] Get 1
[100364us] Launch 2
[100375us] Get 2
[100386us] Launch 3
[100397us] Get 3
[100408us] Launch 4
[100419us] Get 4
[100430us] Launch 5
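The lazy behaviour can also be checked without timestamps. The following is only a sketch; DeferredResult and observe_deferred are names invented for this example:

```cpp
#include <cassert>
#include <chrono>
#include <future>
#include <thread>

// Hypothetical result type for this example.
struct DeferredResult {
    bool ran_before_get;  // did the task execute before get() was called?
    bool ran_on_caller;   // did the task execute on the calling thread?
};

DeferredResult observe_deferred() {
    const std::thread::id caller = std::this_thread::get_id();
    bool ran = false;  // only ever touched on the calling thread, so no race

    std::future<std::thread::id> f = std::async(std::launch::deferred, [&ran] {
        ran = true;
        return std::this_thread::get_id();
    });

    // For a deferred task, wait_for reports `deferred` instead of waiting.
    assert(f.wait_for(std::chrono::seconds(0)) == std::future_status::deferred);

    const bool ran_before_get = ran;          // nothing has executed yet
    const std::thread::id worker = f.get();   // the lambda runs here, inline
    return {ran_before_get, worker == caller};
}
```

Here ran_before_get comes back false and ran_on_caller comes back true, which matches the alternating Launch/Get pattern in the output above.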