cuda thrust

Thrust reduce not working with non equal input/output types


I'm trying to compute the min and max of an array of values using Thrust, and I seem to be stuck. Given an array of floats, I would like to reduce it to its min and max values in one pass, but using Thrust's reduce method I instead get the mother (or at least auntie) of all template compile errors.

My original code contains 5 lists of values spread over 2 float4 arrays that I want reduced, but I've boiled it down to this short example.

struct ReduceMinMax {
    __host__ __device__
    float2 operator()(float lhs, float rhs) {
        return make_float2(Min(lhs, rhs), Max(lhs, rhs));
    }
};

int main(int argc, char *argv[]){

    thrust::device_vector<float> hat(4);
    hat[0] = 3;
    hat[1] = 5;
    hat[2] = 6;
    hat[3] = 1;

    ReduceMinMax binary_op_of_dooooom;
    thrust::reduce(hat.begin(), hat.end(), 4.0f, binary_op_of_dooooom);
}

If I split it into two reductions instead, it of course works. My question is then: is it possible to reduce both the min and max in one pass with Thrust, and how? If not, what is the most efficient way of achieving said reduction? Will a transform iterator help me (and if so, will the reduction then be a one-pass reduction)?

Some additional info: I'm using Thrust 1.5 (as supplied by CUDA 4.2.7). My actual code is using reduce_by_key, not just reduce. I found transform_reduce while writing this question, but that one doesn't take keys into account.


Solution

  • As talonmies notes, your reduction does not compile because thrust::reduce expects the binary operator's argument types to match its result type, but ReduceMinMax's argument type is float, while its result type is float2.

    thrust::minmax_element implements this operation directly, but if necessary you could instead implement your reduction with thrust::inner_product, which generalizes thrust::reduce:

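    #include <thrust/inner_product.h>
    #include <thrust/device_vector.h>
    #include <thrust/extrema.h>
    #include <cassert>
    #include <cfloat>
    
    // Transform step: turns a pair of floats into a (min, max) float2.
    struct minmax_float
    {
      __host__ __device__
      float2 operator()(float lhs, float rhs) const
      {
        return make_float2(thrust::min(lhs, rhs), thrust::max(lhs, rhs));
      }
    };
    
    // Reduction step: merges two (min, max) pairs; argument and result types match.
    struct minmax_float2
    {
      __host__ __device__
      float2 operator()(float2 lhs, float2 rhs) const
      {
        return make_float2(thrust::min(lhs.x, rhs.x), thrust::max(lhs.y, rhs.y));
      }
    };
    
    float2 minmax1(const thrust::device_vector<float> &x)
    {
      // Both input ranges are x, so the transform sees (x[i], x[i]).
      // The initial value is the identity for (min, max), so it never affects the result.
      return thrust::inner_product(x.begin(), x.end(), x.begin(), make_float2(FLT_MAX, -FLT_MAX), minmax_float2(), minmax_float());
    }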
    
    float2 minmax2(const thrust::device_vector<float> &x)
    {
      using namespace thrust;
      pair<device_vector<float>::const_iterator, device_vector<float>::const_iterator> ptr_to_result;
    
      ptr_to_result = minmax_element(x.begin(), x.end());
    
      return make_float2(*ptr_to_result.first, *ptr_to_result.second);
    }
    
    int main()
    {
      thrust::device_vector<float> hat(4);
      hat[0] = 3;
      hat[1] = 5;
      hat[2] = 6;
      hat[3] = 1;
    
      float2 result1 = minmax1(hat);
      float2 result2 = minmax2(hat);
    
      assert(result1.x == result2.x);
      assert(result1.y == result2.y);
    }
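
The type rule behind the compile error can also be seen without a GPU. Below is a minimal host-only sketch in standard C++17, not Thrust code: each float is first lifted into a (min, max) pair, and the reduction operator then only ever combines pairs, so its argument and result types match. The names `MinMax`, `to_minmax`, `merge_minmax` and `minmax_one_pass` are made up for illustration. Thrust's `transform_reduce` follows the same idea, although its argument order differs from `std::transform_reduce`.

```cpp
#include <algorithm>
#include <cassert>   // only needed if you assert on the result, as in the usage note
#include <limits>
#include <numeric>
#include <utility>
#include <vector>

// Hypothetical alias for illustration: first = running min, second = running max.
using MinMax = std::pair<float, float>;

// Transform step: lift one float into a (min, max) pair.
inline MinMax to_minmax(float v) { return MinMax(v, v); }

// Reduction step: MinMax x MinMax -> MinMax, so argument and result types match.
inline MinMax merge_minmax(const MinMax& a, const MinMax& b) {
    return MinMax(std::min(a.first, b.first), std::max(a.second, b.second));
}

// One pass over the data; an empty input returns the identity value unchanged.
inline MinMax minmax_one_pass(const std::vector<float>& x) {
    const MinMax identity(std::numeric_limits<float>::max(),
                          std::numeric_limits<float>::lowest());
    return std::transform_reduce(x.begin(), x.end(), identity,
                                 merge_minmax, to_minmax);
}
```

Calling `minmax_one_pass` on the values from the question (3, 5, 6, 1) gives a pair whose `first` is the minimum and whose `second` is the maximum. The identity pair plays the same role as the initial value passed to `thrust::inner_product` above.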