I want to use parallel_reduce to sum the elements of a vector, but when the vector has many elements the result is not correct. How should I use this function?
// prepare data
const size_t allNum = 1000000;
std::vector<double> a;
for (size_t i = 0; i < allNum; ++i)
{
a.push_back(double(i + 1));
}
// λ func
auto f = [&]() -> double {
return tbb::parallel_reduce(tbb::blocked_range<size_t>(0, allNum),
0.0,
[&](const tbb::blocked_range<size_t>& r, double init) -> double {
for (size_t i = r.begin(); i < r.end(); ++i)
{
init += a[i];
}
return init;
},
[](double f, double s) -> double {
return f + s;
}
/*std::plus<double>()*/);
};
// call λ func, get the result
double correctResult = (1.0 + 1000000.0) * 500000.0;
double sum = f(); // sum != correctResult
// sum is different on every run
I ran the code above and it worked fine: it produced the correct result.
For more information about parallel_reduce, see the links below:
https://software.intel.com/content/www/us/en/develop/documentation/tbb-documentation/top/intel-threading-building-blocks-developer-reference/algorithms/parallelreduce-template-function.html
https://link.springer.com/content/pdf/10.1007%2F978-1-4842-4398-5.pdf
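To make the expected behavior concrete, here is a minimal sketch that models the split/reduce/join pattern of parallel_reduce sequentially, using only the standard library. It is not TBB's implementation: chunked_reduce and sum_one_to_n are hypothetical helper names, and the fixed chunk size stands in for whatever subranges the scheduler would pick. Each subrange starts from the identity value 0.0 and the partial sums are then combined. Because every element and every partial sum here is an integer below 2^53, each double addition is exact, so the total should not depend on how the range is split.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not a TBB API): a sequential model of the
// split/reduce/join pattern that tbb::parallel_reduce follows.
template <typename Body, typename Join>
double chunked_reduce(std::size_t begin, std::size_t end, std::size_t grain,
                      double identity, Body body, Join join)
{
    assert(grain > 0);  // a zero grain would never advance the loop
    double total = identity;
    for (std::size_t lo = begin; lo < end; lo += grain) {
        const std::size_t hi = std::min(lo + grain, end);
        // Every subrange starts from the identity, never from a shared sum.
        const double partial = body(lo, hi, identity);
        total = join(total, partial);
    }
    return total;
}

// Hypothetical helper: fills a vector with 1.0 .. n and sums it in chunks.
double sum_one_to_n(std::size_t n, std::size_t grain)
{
    std::vector<double> a(n);
    std::iota(a.begin(), a.end(), 1.0);
    return chunked_reduce(
        0, n, grain, 0.0,
        [&a](std::size_t lo, std::size_t hi, double init) {
            // Accumulate into the value passed in and return it.
            for (std::size_t i = lo; i < hi; ++i) init += a[i];
            return init;
        },
        [](double x, double y) { return x + y; });
}
```

Calling sum_one_to_n(1000000, grain) with different grain values should give (1.0 + 1000000.0) * 500000.0 each time. If the real parallel_reduce call returns something else for this data, the cause is more likely in the build or environment than in floating-point rounding.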
Thanks, Santosh