c++, c++11, mfc, ppl

Class object inside or outside a parallel_for / parallel_for_each?


I have been studying parallel loops (PPL, used with C++11 lambdas) and testing them with MS Visual Studio 2013. I am clear about how they work (lambdas especially), and they are pretty cool.

My concern is that I have to call a function that performs a simple Euclidean distance measure. The function by itself is clear, but I have to move it into a class called EuclideanDistance and do the Euclidean math inside the member function Match(vectorA,vectorB), which is simply some norm(...) calculation on two vectors and returns a floating point value.

Now how do I go about this inside a parallel_for/parallel_for_each loop? Do I create the class object inside the loop, or will keeping the class object outside the loop cause inconsistencies? If I understood parallel loops correctly, the function the loop runs is basically a clean copy for every thread launched. Does this also happen for class member functions? My hunch is no, unless I create the object inside the loop as shown in the second code snippet.

For example (code abbreviated for readability):

vectorA; // Floating point array of 1024 entries.
concurrent_queue vectorQ; // each entry in the queue is a 1024 array
EuclideanDistance euclid;
parallel_for_each(begin,end,[&](auto item)
{
    auto distance = euclid.Match(vectorA,item);
});

Or would this be the right way of doing it?

parallel_for_each(begin,end,[&](auto item)
{
    EuclideanDistance euclid;
    auto distance = euclid.Match(vectorA,item);
});

The whole class is nothing more than a single function.

class EuclideanDistance
{
public:
    float Match(vectorA,vectorB)
    {
        return norm(vectorA,vectorB);
    }
};

Any pointers on pitfalls would be highly appreciated!


Solution

  • You are correct that if you define your EuclideanDistance object outside the parallel_for_each lambda body, it will be shared across all the worker threads executing the parallel_for_each. This would be a problem if your Match() function had side effects on shared state in the EuclideanDistance object. In that case, though, defining the object inside the lambda (which gives each execution of the loop body its own local instance) would likely produce different results from defining it outside, so the two versions would not be interchangeable anyway.

    As long as the functions you call on the EuclideanDistance object have no side effects and do not modify shared state, you are fine using one object defined outside the lambda. If you call functions with side effects, you need to do your own synchronization, which would likely reduce the performance gain from parallel_for_each significantly.
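
To make the "no side effects" case concrete, here is a minimal sketch of one shared EuclideanDistance object used from several threads. It rests on a few assumptions that are not in the question: the vectors are std::vector<float>, the norm(...) call is replaced by a hand-written L2 distance, MatchAll is a hypothetical helper name, and plain std::thread stands in for PPL's parallel_for_each so the snippet builds without <ppl.h>.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <thread>
#include <vector>

// Stateless distance class: Match() is const and reads only its arguments,
// so a single instance can safely be called from many threads at once.
class EuclideanDistance {
public:
    float Match(const std::vector<float>& a, const std::vector<float>& b) const {
        assert(a.size() == b.size());
        float sum = 0.0f;
        for (std::size_t i = 0; i < a.size(); ++i) {
            const float d = a[i] - b[i];
            sum += d * d;
        }
        return std::sqrt(sum);
    }
};

// Hypothetical helper (not from the question): distance from `query` to each
// item, with the items split into contiguous chunks across `workers` threads.
// Every thread writes only to its own slots of `out`, so no locking is needed.
std::vector<float> MatchAll(const std::vector<float>& query,
                            const std::vector<std::vector<float>>& items,
                            unsigned workers = 4) {
    EuclideanDistance euclid;  // one object, defined outside the loop and shared
    std::vector<float> out(items.size());
    workers = std::max(1u, workers);
    const std::size_t chunk = (items.size() + workers - 1) / workers;

    std::vector<std::thread> pool;
    for (unsigned w = 0; w < workers; ++w) {
        const std::size_t first = std::min(items.size(), w * chunk);
        const std::size_t last = std::min(items.size(), first + chunk);
        if (first == last) break;  // nothing left to hand out
        pool.emplace_back([&, first, last] {
            for (std::size_t i = first; i < last; ++i)
                out[i] = euclid.Match(query, items[i]);
        });
    }
    for (auto& t : pool) t.join();
    return out;
}
```

With PPL the inner loop would presumably become the body of a concurrency::parallel_for over the index range, with the same shared euclid captured by reference. Note that the items are taken by const reference here; a lambda parameter taken by value, as in the question, copies each 1024-entry vector on every iteration.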