I'm receiving the compiler error
static_assert failed: 'Attempt to use an extended __device__ lambda in a context that requires querying its return type in host code. Use a named function object, a __host__ __device__ lambda, or cuda::proclaim_return_type instead.'
when compiling this code:
thrust::device_vector<float2> a;
thrust::device_vector<float> b;
float param1, param2;
float2 param3;
thrust::transform_reduce(
    thrust::make_zip_iterator(thrust::make_tuple(a.begin(), b.begin())),
    thrust::make_zip_iterator(thrust::make_tuple(a.end(), b.end())),
    [param1, param2, param3] __device__ (thrust::tuple<float2, float> const& tuple)
    {
        /* do something and return a float2 */
    },
    float2{},
    [] __device__ (float2 const& first, float2 const& second)
    {
        float2 result{};
        result.x = first.x + second.x;
        result.y = first.y + second.y;
        return result;
    });
How can I rewrite this code so that it compiles? I'm new to CUDA/Thrust and it's quite obscure to me how I need to deal with this problem.
(BTW, do I really need the complicated "transformation" function here? thrust::plus<float2> doesn't work.)
How can I rewrite this code so that it compiles?
A possible solution seems to be indicated in the assert message you posted in your question:
static_assert failed: 'Attempt to use an extended __device__ lambda in a context that requires querying its return type in host code. Use a named function object, a __host__ __device__ lambda, or cuda::proclaim_return_type instead.'
When I change each lambda to be decorated with __host__ __device__ instead of just __device__, the code compiles for me on CUDA 12.1.
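For illustration, here is a minimal sketch of the question's call with both lambdas marked __host__ __device__. The wrapper function weighted_sum and the body of the first lambda are hypothetical (the question elides the real computation), and the snippet assumes nvcc is invoked with --extended-lambda, which the original __device__ lambdas already require.

```cuda
#include <thrust/device_vector.h>
#include <thrust/iterator/zip_iterator.h>
#include <thrust/transform_reduce.h>
#include <thrust/tuple.h>

// Hypothetical wrapper (not from the question) around the transform_reduce call.
float2 weighted_sum(thrust::device_vector<float2> const& a,
                    thrust::device_vector<float> const& b,
                    float param1, float param2, float2 param3)
{
    return thrust::transform_reduce(
        thrust::make_zip_iterator(thrust::make_tuple(a.begin(), b.begin())),
        thrust::make_zip_iterator(thrust::make_tuple(a.end(), b.end())),
        // __host__ __device__ lets Thrust query the return type in host code.
        [param1, param2, param3] __host__ __device__ (thrust::tuple<float2, float> const& tuple)
        {
            // Placeholder computation, assumed for this sketch only:
            // scale the float2 by (b * param1 + param2), then offset by param3.
            float2 const v = thrust::get<0>(tuple);
            float const w = thrust::get<1>(tuple) * param1 + param2;
            return float2{v.x * w + param3.x, v.y * w + param3.y};
        },
        float2{},
        // Component-wise sum used as the reduction operator.
        [] __host__ __device__ (float2 const& first, float2 const& second)
        {
            return float2{first.x + second.x, first.y + second.y};
        });
}
```

Calling weighted_sum(a, b, param1, param2, param3) from host code then returns the reduced float2. The assert message also lists a named function object or cuda::proclaim_return_type as alternatives if the lambdas should stay __device__-only.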
(BTW, do I really need the complicated "transformation" function here? thrust::plus<float2> doesn't work.)
CUDA doesn't provide arithmetic operators for its built-in vector types, and AFAIK Thrust doesn't either. The documentation for thrust::plus<T> only requires that, if x and y are objects of type T, "then x+y must be defined".
So you'll need to provide your own definition to add float2 types, i.e.
__host__ __device__ __forceinline__
float2 operator+(float2 left, float2 right) noexcept {
    return float2{left.x + right.x,
                  left.y + right.y};
}
will allow you to use thrust::plus<float2>.
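As a rough sketch of how that fits together, the snippet below repeats the operator so it stands alone and uses thrust::plus<float2> as the reduction operator. The helper name sum_float2 is made up for this example, and thrust::reduce is used instead of transform_reduce only to keep it short.

```cuda
#include <thrust/device_vector.h>
#include <thrust/functional.h>
#include <thrust/reduce.h>

// User-provided component-wise addition; thrust::plus<float2> finds it
// through argument-dependent lookup in the global namespace.
__host__ __device__ __forceinline__
float2 operator+(float2 left, float2 right) noexcept {
    return float2{left.x + right.x,
                  left.y + right.y};
}

// Hypothetical helper (not from the question): sums all elements on the device.
float2 sum_float2(thrust::device_vector<float2> const& values)
{
    return thrust::reduce(values.begin(), values.end(),
                          float2{}, thrust::plus<float2>{});
}
```

The same thrust::plus<float2>{} object could presumably replace the second lambda in the question's transform_reduce call, since it is a named function object with a return type that is visible to host code.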