c++multithreading for-loop thread-safety boost-thread

How to use multi-threading within a loop that iterates through a point cloud in C++?

I have made a function that estimates the normal vectors of a 3D Point Cloud and it takes a lot of time to run on a cloud of size 2 million. I want to multi-thread by calling the same function on two different points at the same time but it didn't work (it was creating hundreds of threads). Here is what I tried:

// kd-tree used for finding neighbours
pcl::KdTreeFLANN<pcl::PointXYZRGB> kdt;

// cloud iterators
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it = pt_cl->points.begin();
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it1;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it2;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it3;
pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it4;

// initializing tree
kdt.setInputCloud(pt_cl);

// loop exit condition
bool it_completed = false;

while (!it_completed)
{
    // initializing cloud iterators
    cloud_it1 = cloud_it;
    cloud_it2 = cloud_it++;
    cloud_it3 = cloud_it++;

    if (cloud_it3 != pt_cl->points.end())
    {
        // attaching threads
        boost::thread thread_1 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it1, kdt, radius, max_neighbs);
        boost::thread thread_2 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it2, kdt, radius, max_neighbs);
        boost::thread thread_3 = boost::thread(geom::vectors::find_normal, pt_cl, cloud_it3, kdt, radius, max_neighbs);

        // joining threads
        thread_1.join();
        thread_2.join();
        thread_3.join();

        cloud_it++;
    }

    else
    {
        it_completed = true;
    }
}

As you can see I am trying to call the same function on 3 different points at the same time. Any suggestions for how to make this work? Sorry for the poor code, I'm tired and thank you in advance.

EDIT: here is the find_normal function Here are the parameters:

@param pt_cl is a pointer to the point cloud to be treated (pcl::PointCloud<PointXYZRGB>::Ptr)
@param cloud_it is an iterator of this cloud (pcl::PointCloud<PointXYZRGB>::iterator)
@param kdt is the kd_tree used to find the closest neighbours of a point
@param radius defines the range in which to search for the neighbours of a point
@param max_neighbs is the maximum number of neighbours to be returned by the radius search

// auxilliary vectors for the k-tree nearest search
    std::vector<int> pointIdxRadiusSearch; // neighbours ids
    std::vector<float> pointRadiusSquaredDistance; // distances from the source to the neighbours

    // the vectors of which the cross product calculates the normal
    geom::vectors::vector3 *vect1;
    geom::vectors::vector3 *vect2;
    geom::vectors::vector3 *cross_prod;
    geom::vectors::vector3 *abs_cross_prod;
    geom::vectors::vector3 *normal;
    geom::vectors::vector3 *normalized_normal;

    // vectors to average
    std::vector<geom::vectors::vector3> vct_toavg;

    // if there are neighbours left
    if (kdt.radiusSearch(*cloud_it, radius, pointIdxRadiusSearch, pointRadiusSquaredDistance, max_neighbs) > 0)
    {

        for (int pt_index = 0; pt_index < (pointIdxRadiusSearch.size() - 1); pt_index++)
        {
            // defining the first vector
            vect1 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[pt_index + 1]]);

            // defining the second vector; making sure there is no 'out of bounds' error
            if (pt_index == pointIdxRadiusSearch.size() - 2)
                vect2 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[1]]);


            else
                vect2 = geom::vectors::create_vect2p((*cloud_it), pt_cl->points[pointIdxRadiusSearch[pt_index + 2]]);

            // adding the cross product of the two previous vectors to our list
            cross_prod = geom::vectors::cross_product(*vect1, *vect2);
            abs_cross_prod = geom::aux::abs_vector(*cross_prod);
            vct_toavg.push_back(*abs_cross_prod);

            // freeing memory
            delete vect1;
            delete vect2;
            delete cross_prod;
            delete abs_cross_prod;
        }

        // calculating the normal
        normal = geom::vectors::vect_avg(vct_toavg);

        // calculating the normalized normal
        normalized_normal = geom::vectors::normalize_normal(*normal);

        // coloring the point
        geom::aux::norm_toPtRGB(&(*cloud_it), *normalized_normal);

        // freeing memory
        delete normal;
        delete normalized_normal;

        // clearing vectors
        vct_toavg.clear();
        pointIdxRadiusSearch.clear();
        pointRadiusSquaredDistance.clear();

        // shrinking vectors
        vct_toavg.shrink_to_fit();
        pointIdxRadiusSearch.shrink_to_fit();
        pointRadiusSquaredDistance.shrink_to_fit();
    }

Solution

Since I don't quite get it how the result data is being stored, I'm going to suggest a solution based on OpenMP that matches the code you've posted.

// kd-tree used for finding neighbours
pcl::KdTreeFLANN<pcl::PointXYZRGB> kdt;

#pragma openmp parallel for schedule(static)
for (pcl::PointCloud<pcl::PointXYZRGB>::iterator cloud_it = pt_cl->points.begin();
    cloud_it < pt_cl.end();
    ++cloud_it) {
    geom::vectors::find_normal, pt_cl, cloud_it, kdt, radius, max_neighbs);
}

Note that you should be using the < comparison, and not the != one, -that's how OpenMP works (it wants random access iterators). I'm using the static schedule since every element should take more or less identical time to process. If that's not the case, try using schedule(dynamic) instead.

This solution uses OpenMP, and you may investigate e.g. TBB as well, though it has a higher entrance barrier than OpenMP and uses an OOP-style API.

Also, repeating what I've said in the comments already: OpenMP as well as TBB are going to handle thread management and load distribution for you. You only pass them hints (such as schedule(static)) on how to do it to so as to better suit your needs.

Other than that, please, do get in the habit of repeating as little code as you can; ideally, no code should be duplicated. E.g. when you declare many variables of the same type, or call a certain function a few times in a row with a similar pattern, etc. I also see excessive commenting in the code, with an unclear reason behind it.