If I use
float sum = thrust::transform_reduce(d_a.begin(), d_a.end(), conditional_operator(), 0.f, thrust::plus<float>());
I get the sum of all elements meeting a condition provided by conditional_operator(), as in Conditional reduction in CUDA.
But how can I sum only the elements d_a[0], d_a[2], d_a[4], d_a[6], ...?
I thought of changing the conditional operator, but it works on the elements of the array without any reference to their index.
What can I do about that?
There are two approaches I can think of for solving this sort of problem:
It might be worth implementing both and benchmarking them to see which approach is faster.