Tags: c++, vector, stl, cuda, thrust

Thrust to STL copy doesn't work as intended


I'm not sure how thrust::copy to an STL vector actually works. When I do the following, I get the expected result:

struct TestOperation
{
    TestOperation(){}

    __host__ __device__
    CustomPoint operator()(const CustomPoint& point)
    {
        CustomPoint pt;
        pt.x = point.x * 2;
        pt.y = point.y * 2;
        pt.z = point.z;
        return pt;
    }
};

void CudaLoader::TestLoader(std::vector<CustomPoint>& customPoints) //Host vector reference
{
    thrust::device_vector<CustomPoint> devicePoints(customPoints.begin(), customPoints.end());
    thrust::device_vector<CustomPoint> output;
    output.reserve(devicePoints.size());
    thrust::transform(devicePoints.begin(), devicePoints.end(), output.begin(), TestOperation());
    for (int i = 0; i < customPoints.size(); i++)
    {
        customPoints[i] = output[i];
    }
}

But looping through all the elements, especially when there are many of them, doesn't seem optimal to me, so I wanted to use thrust::copy. But when I try to do:

thrust::copy(output.begin(), output.end(), customPoints.begin());

instead of the loop, I don't get the expected result: the host STL vector whose reference was passed as the parameter remains unchanged. In addition, output.size() returns 0, even though I can see that the storage size is correct. Why is that?


Solution

  • The source of the problem is this:

       thrust::device_vector<CustomPoint> output;
       output.reserve(devicePoints.size()); 
    

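    reserve only changes the guaranteed minimum storage allocation for the vector. It doesn't change its size, so in the code above output.size() is still 0 and output.begin() == output.end(). Also note that thrust::transform doesn't alter the size of the output vector; it only writes through the output iterator. As long as there is enough valid memory to hold the output of the transformation, no illegal memory access error will be produced by the thrust kernel which performs the transform operation, which is why reading output[i] in the loop happened to return the transformed values. thrust::copy(output.begin(), output.end(), ...), on the other hand, is given an empty range and therefore copies nothing.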

    Do this instead:

       thrust::device_vector<CustomPoint> output;
       output.resize(devicePoints.size());
       thrust::transform(devicePoints.begin(), devicePoints.end(), output.begin(), TestOperation());
    

    Then

    thrust::copy(output.begin(), output.end(), customPoints.begin());
    

    will work as expected because output has a non-zero size.
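
The reserve/resize distinction is the same as for std::vector, so it can be checked on the host without a GPU. The following is only a minimal sketch of that analogy, not the Thrust code itself: it uses std::vector<int> in place of thrust::device_vector<CustomPoint>, and the function names elementsCopiedAfterReserve and doubledAfterResize are made up for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: reproduces the failing pattern with std::vector.
// Returns how many elements a copy out of a reserve()-only vector transfers.
std::size_t elementsCopiedAfterReserve(const std::vector<int>& input)
{
    std::vector<int> output;
    output.reserve(input.size());   // capacity grows, size() stays 0

    std::vector<int> host(input.size(), -1);
    // [output.begin(), output.end()) is an empty range, so nothing is copied.
    auto last = std::copy(output.begin(), output.end(), host.begin());
    return static_cast<std::size_t>(last - host.begin());
}

// Hypothetical helper: the corrected pattern. resize() makes the elements
// part of the vector, so transform writes into a valid range and the
// following copy sees all of them.
std::vector<int> doubledAfterResize(const std::vector<int>& input)
{
    std::vector<int> output;
    output.resize(input.size());    // size() == input.size()

    std::transform(input.begin(), input.end(), output.begin(),
                   [](int v) { return v * 2; });

    std::vector<int> host(input.size(), -1);
    std::copy(output.begin(), output.end(), host.begin());
    return host;
}

// Small self-check that can be called from main().
void runReserveVersusResizeDemo()
{
    const std::vector<int> input{1, 2, 3};
    assert(elementsCopiedAfterReserve(input) == 0);
    assert((doubledAfterResize(input) == std::vector<int>{2, 4, 6}));
}
```

In the Thrust version, the vector can also be sized when it is constructed, thrust::device_vector<CustomPoint> output(devicePoints.size());, which avoids the separate resize call.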