I'm playing around with building a histogram in parallel using parallel_reduce:
#include "stdint.h"
#include "tbb/tbb.h"
#include <algorithm>
#include <vector>
#include <functional>
#include <iostream>
#include <numeric>
void buildhistogram(const uint8_t *inputImage, const size_t numElements, double *outputHist){
    auto range = tbb::blocked_range<size_t>(0,numElements);
    auto buildHistogramFcn = [&](const tbb::blocked_range<size_t>& r, const std::vector<double>& initHist){
        std::vector<double> localHist(initHist);
        for (size_t idx = r.begin(); idx != r.end(); ++idx){
            localHist[inputImage[idx]]++;
        }
        return localHist;
    };
    auto reductionFcn = [&](const std::vector<double>& hist1, const std::vector<double>& hist2){
        std::vector<double> histOut(hist1.size());
        std::transform(hist1.begin(),hist1.end(),hist2.begin(),histOut.begin(),std::plus<double>());
        return histOut;
    };
    std::vector<double> identity(256);
    auto output = tbb::parallel_reduce(range, identity, buildHistogramFcn, reductionFcn);
    std::copy(output.begin(),output.end(),outputHist);
}
My question concerns the definition of Func in the lambda form of parallel_reduce. If you look at the Intel documentation:
https://software.intel.com/en-us/node/506154
They document the second RHS argument of Func as being const:
Value Func::operator()(const Range& range, const Value& x)
However, if you look at their example code, they define an example where the second RHS is non-const, and in fact they modify and return this variable:
auto intelExampleFcn = [](const blocked_range<float*>& r, float init)->float {
    for( float* a=r.begin(); a!=r.end(); ++a )
        init += *a;
    return init;
};
If I try to declare the variable "initHist" as being non-const and work with this memory directly without allocating and returning a local copy:
auto buildHistogramFcn = [&](const tbb::blocked_range<size_t>& r, std::vector<double>& initHist){
    for (size_t idx = r.begin(); idx != r.end(); ++idx){
        initHist[inputImage[idx]]++;
    }
    return initHist;
};
I get a compilation error:
/tbb/include/tbb/parallel_reduce.h:322:24: error: no matching function for call to object of type 'const (lambda at buildhistogramTBB.cpp:16:30)'
    my_value = my_real_body(range, const_cast<const Value&>(my_value));
I'm interested in whether the second RHS argument of the lambda can actually be non-const, because I'd like to be able to avoid making a copy of the vector from the init argument to a local variable that I return.
Am I misunderstanding something, or is Intel's example incorrect?
The second "non-const" argument in Intel's example is being passed by value. If you were to pass your initHist vector by value (as opposed to by reference), it would also not need const. (It would, of course, copy the vector. But this seems to be what you are doing anyway.)
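To make the by-value point concrete, here is a minimal sketch that does not use TBB at all. The helper callBody is a hypothetical stand-in I made up to mimic the fact that the accumulated value reaches the body as a const reference; histogramByValue and the begin/end index pair (in place of tbb::blocked_range) are likewise assumptions for illustration only. It shows that a lambda taking the histogram by value can modify and return it without any const qualifier.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the library call site (NOT the TBB API):
// the accumulated value is handed to the body as a const reference.
template <typename Value, typename Func>
Value callBody(const Func& body, std::size_t begin, std::size_t end, const Value& value) {
    return body(begin, end, value);
}

// Hypothetical helper: counts byte values over the whole image.
std::vector<double> histogramByValue(const std::vector<std::uint8_t>& image) {
    // The third parameter is taken by value, so it is copy-constructed from
    // the const reference above. The lambda owns that copy and may modify it.
    auto body = [&](std::size_t begin, std::size_t end, std::vector<double> hist) {
        for (std::size_t idx = begin; idx != end; ++idx) {
            ++hist[image[idx]];
        }
        return hist;  // a by-value parameter is moved out, not copied again
    };
    return callBody(body, 0, image.size(), std::vector<double>(256));
}
```

Declaring the parameter as std::vector<double>& instead would fail to compile here for the same reason as in the question: a non-const reference cannot bind to a const argument. The by-value form still performs one copy per call, which is the same cost as the explicit localHist(initHist) copy in the original code.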