cuda cpu gpu reduce thrust

Thrust: How to directly control where an algorithm invocation executes?


Nothing in the following code indicates whether it runs on the CPU or the GPU. Where is the "reduce" operation executed?

#include <thrust/iterator/counting_iterator.h>
...
// create iterators
thrust::counting_iterator<int> first(10);
thrust::counting_iterator<int> last = first + 3;

first[0];   // returns 10
first[1];   // returns 11
first[100]; // returns 110

// sum of [first, last)
thrust::reduce(first, last);   // returns 33 (i.e. 10 + 11 + 12)

Furthermore,

thrust::transform_reduce(
    thrust::counting_iterator<unsigned int>(0),
    thrust::counting_iterator<unsigned int>(N),
    MyOperation(data), 0, thrust::plus<unsigned int>());

Even though data is defined as a thrust::host_vector, this call tries to execute on the GPU (the compiler emits related errors because the file name ends in .cpp). How can I make the code run on the CPU? Or should I look for another way to perform the same operation, e.g. without using counting_iterator?


Solution

  • By default, algorithm invocations like this execute on the device backend (i.e., the GPU in your case).

    If you're using Thrust 1.7 or later, pass the thrust::host execution policy as the first argument to force an algorithm invocation to execute on the host (i.e., the CPU):

    #include <thrust/execution_policy.h>
    
    ...
    
    thrust::reduce(thrust::host, first, last);
    
    ...
    
    thrust::transform_reduce(thrust::host,
                             first,
                             last,
                             MyOperation(data),
                             0,
                             thrust::plus<unsigned int>());
    

    If you're using Thrust 1.6, you can retarget the invocations to the host by retagging an existing iterator:

    #include <thrust/iterator/retag.h>
    
    ...
    
    thrust::reduce(thrust::retag<thrust::host_system_tag>(first),
                   thrust::retag<thrust::host_system_tag>(last));
    
    ...
    
    thrust::transform_reduce(thrust::retag<thrust::host_system_tag>(first),
                             thrust::retag<thrust::host_system_tag>(last),
                             MyOperation(data),
                             0,
                             thrust::plus<unsigned int>());
    

    If you're using a version of Thrust older than 1.6, you need to pass host_space_tag to counting_iterator as a template parameter:

    thrust::reduce(thrust::counting_iterator<unsigned int, thrust::host_space_tag>(0),
                   thrust::counting_iterator<unsigned int, thrust::host_space_tag>(N));
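
The three variants above share one idea: the backend is selected at compile time, either from a tag carried in the iterator's type or from an explicit policy argument. The following is a toy model of that kind of tag dispatch in standard C++ only. It is not Thrust's actual implementation: the namespace toy and every name in it (host_tag, device_tag, tagged_iterator, retag, reduce, result, demo) are hypothetical stand-ins made up for illustration, and the "device" path here simply runs on the CPU and reports its name.

```cpp
#include <cassert>
#include <numeric>
#include <string>
#include <vector>

namespace toy {

// Hypothetical system tags (stand-ins for the idea of host/device system tags).
struct host_tag {};
struct device_tag {};

// Hypothetical iterator wrapper that carries a system tag in its type.
template <typename Iterator, typename Tag>
struct tagged_iterator {
    Iterator base;
};

// Same position, different tag: the toy analogue of retagging an iterator.
template <typename NewTag, typename Iterator, typename OldTag>
tagged_iterator<Iterator, NewTag> retag(tagged_iterator<Iterator, OldTag> it) {
    return tagged_iterator<Iterator, NewTag>{it.base};
}

// The sum plus the name of the backend overload that produced it.
struct result {
    int sum;
    std::string backend;
};

// One overload per backend, chosen by overload resolution on the tag type.
template <typename Iterator>
result reduce_impl(host_tag, Iterator first, Iterator last) {
    return result{std::accumulate(first, last, 0), "host"};
}

template <typename Iterator>
result reduce_impl(device_tag, Iterator first, Iterator last) {
    // Stand-in only: a real device backend would launch GPU work here.
    return result{std::accumulate(first, last, 0), "device"};
}

// No policy given: dispatch on the tag embedded in the iterator type.
template <typename Iterator, typename Tag>
result reduce(tagged_iterator<Iterator, Tag> first,
              tagged_iterator<Iterator, Tag> last) {
    return reduce_impl(Tag{}, first.base, last.base);
}

// Explicit policy given: it overrides the iterator's own tag.
template <typename Policy, typename Iterator, typename Tag>
result reduce(Policy policy,
              tagged_iterator<Iterator, Tag> first,
              tagged_iterator<Iterator, Tag> last) {
    return reduce_impl(policy, first.base, last.base);
}

// Exercises the three ways of choosing a backend.
inline void demo() {
    std::vector<int> v{10, 11, 12};
    using It = std::vector<int>::const_iterator;
    tagged_iterator<It, device_tag> first{v.cbegin()};
    tagged_iterator<It, device_tag> last{v.cend()};

    assert(toy::reduce(first, last).backend == "device");             // tag decides
    assert(toy::reduce(host_tag{}, first, last).backend == "host");   // policy decides
    assert(toy::reduce(retag<host_tag>(first),
                       retag<host_tag>(last)).backend == "host");     // retagged
    assert(toy::reduce(host_tag{}, first, last).sum == 33);           // 10 + 11 + 12
}

} // namespace toy
```

Calling toy::demo() from a main function runs the checks. In this model, passing host_tag{} as the first argument plays the role of thrust::host, retag<host_tag> plays the role of thrust::retag<thrust::host_system_tag>, and choosing the Tag template parameter when constructing the iterator corresponds to the pre-1.6 approach.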