c++, for-loop, parallel-processing, tbb

TBB Free Image lambda array comparison error


I have been playing around with Threading Building Blocks (TBB) together with FreeImagePlus on Linux. I am comparing the speed of a sequential and a parallel approach to subtracting one image from another. However, the output of the parallel approach contains anomalies that I am unsure how to solve, and I am in need of some advice.

My question is: why does the image contain more array comparison errors with the parallel version, while the sequential version works fine? The image is supposed to be black with a few white spots, so the extra white pixels in the second image are comparison errors between the two image pixel arrays (of type RGBQUAD).

The two RGBQUAD variables are declared before the code below and effectively act as global variables:

RGBQUAD rgb;    
RGBQUAD rgb2;

Sequential (result image: https://i.sstatic.net/QF8RS.jpg):

for (auto y = 0; y < height; y++)
{
    for(auto x= 0; x < width; x++)
    {
        inputImage.getPixelColor(x, y, &rgb);
        inputImage2.getPixelColor(x, y, &rgb2);
        rgbDiffVal[y][x].rgbRed = abs(rgb.rgbRed - rgb2.rgbRed);
        rgbDiffVal[y][x].rgbBlue = abs(rgb.rgbBlue - rgb2.rgbBlue);
        rgbDiffVal[y][x].rgbGreen = abs(rgb.rgbGreen - rgb2.rgbGreen);
    }
}

With TBB parallel_for (result image: https://i.sstatic.net/nZulZ.jpg):

parallel_for(blocked_range2d<int,int>(0,height, 0, width), [&] (const blocked_range2d<int,int>&r) {
    auto y1 = r.rows().begin();
    auto y2 = r.rows().end();
    auto x1 = r.cols().begin();
    auto x2 = r.cols().end();

    for (auto y = y1; y < y2; y++) {
        for (auto x = x1; x < x2; x++) {
            inputImage.getPixelColor(x, y, &rgb);
            inputImage2.getPixelColor(x, y, &rgb2);

            rgbDiffVal[y][x].rgbRed = abs(rgb.rgbRed - rgb2.rgbRed);
            rgbDiffVal[y][x].rgbBlue = abs(rgb.rgbBlue - rgb2.rgbBlue);
            rgbDiffVal[y][x].rgbGreen = abs(rgb.rgbGreen - rgb2.rgbGreen);
        }
    }
});

I believe it may have something to do with passing pointers to rgb and rgb2 inside a lambda that already captures everything by reference, as this is the only thing I can think of that may affect the process. I have observed that if I change the parallel_for blocked range to height and width, the issue goes away; however, that defeats the point of using a parallel method in the first place.


Solution

  • If these variables are supposed to temporarily store pixel colors, maybe you just need to move the declarations into the lambda, making variables local to each thread. – Alexey Kukanov.

    The variables were indeed declared outside the lambda, so every thread was writing to the same two referenced variables. This caused a race condition: one thread would read a variable while another thread was modifying it.
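
    As a rough illustration of the suggested fix, here is a minimal sketch in which the temporaries are declared inside the worker lambda, so each thread has its own copy. So that it compiles without TBB or FreeImage, it swaps in std::thread for tbb::parallel_for and splits the work by rows only; Pixel, Image and absDiffParallel are hypothetical stand-ins invented for this example, not part of either library. In the original TBB code the equivalent change would presumably be to move the RGBQUAD rgb and rgb2 declarations to the top of the lambda body.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <thread>
#include <vector>

// Hypothetical stand-in for FreeImage's RGBQUAD (same field names).
struct Pixel {
    std::uint8_t rgbBlue;
    std::uint8_t rgbGreen;
    std::uint8_t rgbRed;
    std::uint8_t rgbReserved;
};

// Hypothetical stand-in for fipImage: row-major storage plus a getter
// with the same out-parameter shape as getPixelColor(x, y, &rgb).
struct Image {
    int width;
    int height;
    std::vector<Pixel> data;  // value-initialized, so every pixel starts at zero

    Image(int w, int h)
        : width(w), height(h), data(static_cast<std::size_t>(w) * h) {}

    Pixel& at(int x, int y) {
        return data[static_cast<std::size_t>(y) * width + x];
    }
    void getPixelColor(int x, int y, Pixel* out) const {
        *out = data[static_cast<std::size_t>(y) * width + x];
    }
};

// Per-channel absolute difference, with the rows split across threads.
Image absDiffParallel(const Image& a, const Image& b, unsigned numThreads) {
    assert(a.width == b.width && a.height == b.height);
    Image diff(a.width, a.height);

    const int threads = static_cast<int>(std::max(1u, numThreads));
    const int rowsPerThread = (a.height + threads - 1) / threads;

    std::vector<std::thread> workers;
    for (int t = 0; t < threads; ++t) {
        const int y1 = std::min(a.height, t * rowsPerThread);
        const int y2 = std::min(a.height, y1 + rowsPerThread);
        if (y1 == y2) continue;  // no rows left for this worker

        workers.emplace_back([&a, &b, &diff, y1, y2] {
            // The fix: temporaries live inside the lambda, one pair per thread.
            Pixel rgb;
            Pixel rgb2;
            for (int y = y1; y < y2; ++y) {
                for (int x = 0; x < a.width; ++x) {
                    a.getPixelColor(x, y, &rgb);
                    b.getPixelColor(x, y, &rgb2);
                    // Each thread writes only its own rows of diff.
                    Pixel& out = diff.at(x, y);
                    out.rgbRed = static_cast<std::uint8_t>(std::abs(rgb.rgbRed - rgb2.rgbRed));
                    out.rgbGreen = static_cast<std::uint8_t>(std::abs(rgb.rgbGreen - rgb2.rgbGreen));
                    out.rgbBlue = static_cast<std::uint8_t>(std::abs(rgb.rgbBlue - rgb2.rgbBlue));
                }
            }
        });
    }
    for (auto& w : workers) w.join();
    return diff;
}
```

    Calling absDiffParallel(imageA, imageB, 4) on two equally sized images should give the same result as the sequential loop, because the only state shared between threads is the read-only inputs and disjoint rows of the output. On some toolchains the build may need the -pthread flag.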