I have been experimenting with Threading Building Blocks (TBB) together with FreeImagePlus on Linux. I am trying to compare the speed of a sequential and a parallel approach to subtracting one image from another. However, the output of the parallel approach contains anomalies that I am unsure how to solve, and I would appreciate some advice.
My question is: why does the parallel version produce more pixel comparison errors while the sequential version works fine? The image is supposed to be black with a few white spots, so the extra white pixels in the second image are comparison errors between the two images' pixel values (of type RGBQUAD).
The two RGBQUAD variables are declared before the code below runs and effectively act as global variables:
RGBQUAD rgb;
RGBQUAD rgb2;
https://i.sstatic.net/QF8RS.jpg "Sequential".
for (auto y = 0; y < height; y++)
{
    for (auto x = 0; x < width; x++)
    {
        inputImage.getPixelColor(x, y, &rgb);
        inputImage2.getPixelColor(x, y, &rgb2);
        rgbDiffVal[y][x].rgbRed = abs(rgb.rgbRed - rgb2.rgbRed);
        rgbDiffVal[y][x].rgbBlue = abs(rgb.rgbBlue - rgb2.rgbBlue);
        rgbDiffVal[y][x].rgbGreen = abs(rgb.rgbGreen - rgb2.rgbGreen);
    }
}
https://i.sstatic.net/nZulZ.jpg "with TBB parallel".
parallel_for(blocked_range2d<int,int>(0, height, 0, width), [&] (const blocked_range2d<int,int>& r) {
    auto y1 = r.rows().begin();
    auto y2 = r.rows().end();
    auto x1 = r.cols().begin();
    auto x2 = r.cols().end();
    for (auto y = y1; y < y2; y++) {
        for (auto x = x1; x < x2; x++) {
            inputImage.getPixelColor(x, y, &rgb);
            inputImage2.getPixelColor(x, y, &rgb2);
            rgbDiffVal[y][x].rgbRed = abs(rgb.rgbRed - rgb2.rgbRed);
            rgbDiffVal[y][x].rgbBlue = abs(rgb.rgbBlue - rgb2.rgbBlue);
            rgbDiffVal[y][x].rgbGreen = abs(rgb.rgbGreen - rgb2.rgbGreen);
        }
    }
});
I believe it may have something to do with the lambda capturing rgb and rgb2 by reference and then passing pointers to them into getPixelColor, as this is the only thing I can think of that might affect the process. I have observed that if I change the blocked range of the parallel_for to height and width, the issue goes away; however, this defeats the point of using a parallel method in the first place.
If these variables are supposed to temporarily store pixel colors, maybe you just need to move the declarations into the lambda, making variables local to each thread. – Alexey Kukanov.
The variables were indeed declared outside the lambda, so every thread was writing to the same two variables through the by-reference capture. This caused a race condition in which one thread would read a variable while another thread was modifying it. Declaring them inside the lambda gives each thread its own copies.
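To illustrate the fix in isolation, here is a minimal sketch that uses std::thread instead of TBB so it can be compiled without either library. Pixel, Image, and absDiffParallel are hypothetical stand-ins invented for this example (they are not the FreeImagePlus or TBB API). The point it tries to show is only that the temporaries p1 and p2 are declared inside the worker lambda, so no two threads share them. In the TBB version above, the equivalent change would presumably be to move the RGBQUAD rgb and rgb2 declarations into the lambda body.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical stand-in for RGBQUAD (not the FreeImage type).
struct Pixel {
    std::uint8_t red, green, blue;
};

// Hypothetical stand-in for an image class: a row-major pixel buffer.
struct Image {
    int width;
    int height;
    std::vector<Pixel> data;

    Image(int w, int h, Pixel fill)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h, fill) {}

    std::size_t index(int x, int y) const {
        return static_cast<std::size_t>(y) * width + x;
    }

    void getPixelColor(int x, int y, Pixel* out) const {
        *out = data[index(x, y)];
    }
};

// Computes the per-channel absolute difference, splitting rows across threads.
Image absDiffParallel(const Image& a, const Image& b, int numThreads) {
    assert(a.width == b.width && a.height == b.height && numThreads > 0);
    Image diff(a.width, a.height, Pixel{0, 0, 0});
    const int rowsPerThread = (a.height + numThreads - 1) / numThreads;

    std::vector<std::thread> workers;
    for (int t = 0; t < numThreads; ++t) {
        const int yBegin = std::min(t * rowsPerThread, a.height);
        const int yEnd = std::min(yBegin + rowsPerThread, a.height);
        workers.emplace_back([&a, &b, &diff, yBegin, yEnd] {
            // Declared inside the lambda: each thread has its own temporaries.
            Pixel p1, p2;
            for (int y = yBegin; y < yEnd; ++y) {
                for (int x = 0; x < a.width; ++x) {
                    a.getPixelColor(x, y, &p1);
                    b.getPixelColor(x, y, &p2);
                    // Each thread writes only to its own rows of the output.
                    Pixel& out = diff.data[diff.index(x, y)];
                    out.red = static_cast<std::uint8_t>(std::abs(p1.red - p2.red));
                    out.green = static_cast<std::uint8_t>(std::abs(p1.green - p2.green));
                    out.blue = static_cast<std::uint8_t>(std::abs(p1.blue - p2.blue));
                }
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return diff;
}
```

The output buffer is still shared, but that is safe here because every thread writes to a disjoint set of rows. Only the read-modify-read temporaries needed to become thread-local.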