I would like to create a generic CUDA kernel that takes a callable object as a parameter (such as a lambda or a function) and invokes it.
I am having trouble passing a device function to a CUDA kernel as a parameter.
I have CUDA 9.2 and a GPU with compute capability 3.5. I use gcc 9.3 on Debian 10.
I tried the following, compiled with nvcc -arch=sm_35 --expt-extended-lambda main.cu -o test:
    #include <cstdio>

    __host__ __device__ void say_hello()
    {
        printf("Hello World from function!\n");
    }

    template<class Function>
    __global__ void generic_kernel(Function f)
    {
        f();
    }

    int main()
    {
        // this is working
        generic_kernel<<<1,1>>>([]__device__(){printf("Hello World from lambda!\n");});
        cudaDeviceSynchronize();

        // this is not working!
        generic_kernel<<<1,1>>>(say_hello);
        cudaDeviceSynchronize();

        return 0;
    }
I expected to see both "Hello World from function!" and "Hello World from lambda!", but I only see the message from the lambda.
Debian 10 is not a supported environment for CUDA 9.2, and gcc 9.3 is not a supported host compiler for CUDA 9.2.
There are quite a few questions covering these topics here on the cuda tag. This answer links to a number of them.
The short version is that it is fundamentally impossible to capture a __device__ function address in host code. A kernel launch (as you have it here) is written in host code; it is host code. Therefore the use of say_hello there is in host code, and it refers to the __host__ function pointer/address. That function pointer/address is useless in device code. (Removing the __host__ decorator will not help.)
There are a number of possible solutions, one of which you've already explored: pass the function wrapped in an object of some sort. The __device__ lambda, used directly as you have it, fits that description.
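As a rough illustration of the "wrapped in an object" idea, here is a minimal sketch that uses a functor instead of a lambda. The name SayHelloFunctor is hypothetical and chosen only for this example. The kernel receives the object by value, and the call to say_hello inside operator() is resolved when the device code is compiled, so no host function address is involved.

```cuda
#include <cstdio>

__host__ __device__ void say_hello()
{
    printf("Hello World from function!\n");
}

// Hypothetical wrapper type (not part of the original code).
// The object itself carries no function address; the call to say_hello
// is compiled into the device version of operator().
struct SayHelloFunctor
{
    __host__ __device__ void operator()() const
    {
        say_hello();
    }
};

template<class Function>
__global__ void generic_kernel(Function f)
{
    f();
}

int main()
{
    // Pass the wrapper object by value instead of the bare function name.
    generic_kernel<<<1,1>>>(SayHelloFunctor());
    cudaDeviceSynchronize();
    return 0;
}
```

Wrapping the call in a lambda at the launch site, for example []__device__(){ say_hello(); }, should have the same effect, assuming --expt-extended-lambda is still passed to nvcc.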
Another possible fix for the function pointer approach that is not working for you is to capture the function pointer in device code. It then has to be copied to the host, from where it can be passed back through a kernel launch to device code and dispatched there. The linked answer above gives a number of ways this can be accomplished.
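One way the function pointer route might look is sketched below. It assumes a __device__ variable initialized with the function's address and cudaMemcpyFromSymbol to bring that address to the host. The names void_fn, d_say_hello and h_say_hello are hypothetical and exist only for this example, and say_hello is declared __device__ only here to keep the sketch simple.

```cuda
#include <cstdio>

// Hypothetical alias for "pointer to a function taking and returning nothing".
typedef void (*void_fn)();

__device__ void say_hello()
{
    printf("Hello World from function!\n");
}

// This initializer is evaluated for the device, so the variable holds the
// device address of say_hello (the address host code cannot obtain directly).
__device__ void_fn d_say_hello = say_hello;

template<class Function>
__global__ void generic_kernel(Function f)
{
    f();
}

int main()
{
    // Copy the device-side function address into a host variable.
    void_fn h_say_hello = 0;
    cudaError_t err = cudaMemcpyFromSymbol(&h_say_hello, d_say_hello, sizeof(void_fn));
    if (err != cudaSuccess) {
        printf("cudaMemcpyFromSymbol failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // The host only carries the value; it must never call through it.
    // Passed back into a kernel, it is a valid device function pointer.
    generic_kernel<<<1,1>>>(h_say_hello);
    cudaDeviceSynchronize();
    return 0;
}
```

Compared with the functor or lambda approach, the call goes through a pointer at run time, so the compiler generally cannot inline it. That is the price of being able to choose the function dynamically.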