I'm a recreational Pythonista who just got into PyCUDA, and I'm trying to figure out how to implement linear interpolation (lerp) with it. The Cg function I want to reproduce is: http://http.developer.nvidia.com/Cg/lerp.html
My ultimate goal is a bilinear interpolation in PyCUDA from a set of weighted random points. I've never programmed in C, or CUDA for that matter, and am learning as I go.
This is how far I've gotten:
import pycuda.autoinit
import pycuda.driver as drv
import pycuda.compiler as comp
lerpFunction = """__global__ float lerp(float a, float b, float w)
{
return a + w*(b-a);
}"""
mod = comp.SourceModule(lerpFunction) # This returns an error telling me a global must return a void. :(
Any help on this would be fantastic!
The error message is pretty explicit: CUDA kernels cannot return values, so they must be declared void, with any modifiable arguments passed as pointers. It would make more sense for your lerp implementation to be declared as a device function, like this:
__device__ float lerp(float a, float b, float w)
{
    return a + w*(b-a);
}
and then called from inside a kernel for each value that requires interpolation. As it stands, your lerp function lacks all of the "infrastructure" it would need to be a useful CUDA kernel.
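Since your ultimate goal is bilinear interpolation, it may help to see that a bilerp is just three lerps composed. Here is a plain NumPy sketch of that composition (the names lerp/bilerp and the corner-value convention are my own choices, not anything from the Cg docs) that you can use as a CPU reference to check GPU results against:

```python
import numpy as np

def lerp(a, b, w):
    # same arithmetic as the Cg lerp: a + w*(b - a)
    return a + w * (b - a)

def bilerp(f00, f10, f01, f11, wx, wy):
    # interpolate along x on both rows, then along y between the two results
    top = lerp(f00, f10, wx)
    bottom = lerp(f01, f11, wx)
    return lerp(top, bottom, wy)

# corner values of one grid cell, sampled at fractional position (0.25, 0.75)
print(bilerp(0.0, 1.0, 2.0, 3.0, 0.25, 0.75))  # 1.75
```

Because lerp is written with NumPy-friendly arithmetic, the same functions also work elementwise on whole arrays of corner values.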
EDIT: A really basic kernel along the same lines might look something like this:
__global__ void lerp_kernel(const float *a, const float *b, const float w, float *y)
{
    // unique thread number in the grid; assumes the launch grid exactly covers the arrays
    int tid = threadIdx.x + blockIdx.x*blockDim.x;
    y[tid] = a[tid] + w*(b[tid]-a[tid]);
}