How to get the element index inside thrust::for_each


I am trying to use thrust::for_each to assign values to a device vector. Here is the code:

const uint N = 222222; 
struct assign_functor
{
  template <typename Tuple>
  __device__ 
  void operator()(Tuple t)
  {  
    uint x = threadIdx.x + blockIdx.x * blockDim.x;
    uint y = threadIdx.y + blockIdx.y * blockDim.y;
    uint offset = x + y * blockDim.x * gridDim.x; 

    thrust::get<0>(t) = offset; 
  }
};
int main(int argc, char** argv)
{ 

  thrust::device_vector <float> d_float_vec(N);  

  thrust::for_each(
    thrust::make_zip_iterator( 
      thrust::make_tuple(d_float_vec.begin()) 
    ), 
    thrust::make_zip_iterator( 
      thrust::make_tuple(d_float_vec.end())
    ), 
    assign_functor()
  );

  std::cout<<d_float_vec[10]<<" "<<d_float_vec[N-2]<<std::endl;
}

The output of d_float_vec[N-2] is supposed to be 222220, but it turns out to be 1036. What is wrong with my code?

I know I could use thrust::sequence to fill the vector with a sequence of values. I just want to know how to get the real element index inside thrust::for_each. Thanks!


Solution

  • As noted in the comments, your approach is unlikely ever to work, because it assumes several things about how thrust::for_each works internally that are probably not true, including:

    • You implicitly assume that for_each uses a single thread to process each input element. This is almost certainly not the case; it is much more likely that thrust will process multiple elements per thread during the operation.
    • You also assume that execution happens in order, so that the Nth thread processes the Nth array element. That may not be the case, and execution may occur in an order that cannot be known a priori.
    • You assume that for_each processes the whole input data set in a single kernel launch.
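
    To illustrate the first assumption, here is a minimal hand-written kernel in which each thread processes several elements in a grid-stride loop. It is not thrust's actual implementation; the names record_thread_offset and thread_offsets, and the launch configuration used in the note below, are assumptions chosen purely for illustration. Each element ends up holding the offset of the thread that touched it rather than its own index:

```cuda
#include <cuda_runtime.h>
#include <vector>

// Hypothetical kernel (NOT thrust's implementation): every thread walks the
// input in a grid-stride loop and writes its own global thread offset into
// each element it handles.
__global__ void record_thread_offset(unsigned int* out, unsigned int n)
{
  unsigned int offset = threadIdx.x + blockIdx.x * blockDim.x;
  unsigned int stride = blockDim.x * gridDim.x;
  for (unsigned int i = offset; i < n; i += stride) {
    out[i] = offset;  // thread offset, which differs from i once i >= stride
  }
}

// Hypothetical helper: launch the kernel with a fixed grid and copy the
// result back to the host. Error checking is omitted for brevity.
std::vector<unsigned int> thread_offsets(unsigned int n,
                                         unsigned int blocks,
                                         unsigned int threads)
{
  unsigned int* d_out = nullptr;
  cudaMalloc(&d_out, n * sizeof(unsigned int));
  record_thread_offset<<<blocks, threads>>>(d_out, n);
  std::vector<unsigned int> h_out(n);
  // cudaMemcpy on the default stream waits for the kernel to finish.
  cudaMemcpy(h_out.data(), d_out, n * sizeof(unsigned int),
             cudaMemcpyDeviceToHost);
  cudaFree(d_out);
  return h_out;
}
```

    Calling thread_offsets(222222, 8, 256) uses 8 × 256 = 2048 threads, so element 222220 holds 222220 mod 2048 = 1036. That happens to equal the value reported in the question, which would be consistent with a launch of this general shape, although nothing here confirms what configuration thrust actually used.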

    Thrust algorithms should be treated as black boxes whose internal operation is unspecified; writing a user-defined functor should require no knowledge of those internals. In your example, if you need a sequential index inside a functor, pass in a counting iterator. One way to rewrite your example would be like this:

    #include <iostream>
    #include "thrust/device_vector.h"
    #include "thrust/for_each.h"
    #include "thrust/tuple.h"
    #include "thrust/iterator/counting_iterator.h"
    #include "thrust/iterator/zip_iterator.h"
    
    typedef unsigned int uint;
    const uint N = 222222; 
    struct assign_functor
    {
      template <typename Tuple>
      __device__ 
      void operator()(Tuple t)
      {  
        thrust::get<1>(t) = (float)thrust::get<0>(t);
      }
    };
    
    int main(int argc, char** argv)
    { 
      thrust::device_vector <float> d_float_vec(N);  
      thrust::counting_iterator<uint> first(0);
      thrust::counting_iterator<uint> last = first + N;
    
      thrust::for_each(
        thrust::make_zip_iterator( 
          thrust::make_tuple(first, d_float_vec.begin()) 
        ), 
        thrust::make_zip_iterator( 
          thrust::make_tuple(last, d_float_vec.end())
        ), 
        assign_functor()
      );
    
      std::cout<<d_float_vec[10]<<" "<<d_float_vec[N-2]<<std::endl; 
    }
    

    Here the counting iterator is passed in a tuple along with the data array, giving the functor access to a sequential index that corresponds to the data array entry it is processing.
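
    When the index is the only input the functor needs, the zip iterator can be avoided. The following is a minimal sketch of one alternative, assuming thrust::transform with a counting iterator as the input range and the vector as the output range; the names index_to_float and fill_with_indices are hypothetical and not part of thrust:

```cuda
#include <thrust/device_vector.h>
#include <thrust/transform.h>
#include <thrust/iterator/counting_iterator.h>

// Hypothetical functor: receives the element index supplied by the counting
// iterator and returns the value to store at that position.
struct index_to_float
{
  __host__ __device__
  float operator()(unsigned int i) const { return static_cast<float>(i); }
};

// Hypothetical helper: build a device vector whose i-th element equals i.
thrust::device_vector<float> fill_with_indices(unsigned int n)
{
  thrust::device_vector<float> v(n);
  thrust::counting_iterator<unsigned int> first(0);
  // The input range is the index sequence [0, n); the output is the vector.
  thrust::transform(first, first + n, v.begin(), index_to_float());
  return v;
}
```

    As in the zip iterator version, the index comes from the iterator rather than from threadIdx or blockIdx, so the result does not depend on how thrust maps elements to threads.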