cuda, numba, numba-pro

NumbaPro CUDA Device Function - Return multiple Arrays and local memory


Does anyone know what the correct syntax for the cuda.jit decorator is if you want to write a device function that returns multiple arrays?

If my device function returned one float and took two integer parameters, my decorator would be:

@cuda.jit('float64(int64,int64)', device=True, inline=True)

Now I want my function to take two integer parameters and two floats and return 2 arrays of floats and 2 arrays of integers, all of the same length (between 3 and 5), which depends on the input arguments. How do I do that? Would this be correct:

@cuda.jit(restype=[float64[:], int64[:], float64[:], int64[:]], argtypes=[int64, int64, float64, float64], device=True, inline = True)

Also in my function I would create the arrays I want to return by using: cuda.local.array(). Since I use inline=True, I would expect this to work and the arrays to be accessible only by the respective thread, right?


Solution

  • Now I want my function to take two integer parameters and two floats and return 2 arrays of floats and 2 arrays of integers

    What you are really saying there is that you want your device function to return a tuple (of four arrays). Unfortunately, in the nopython frontend, I don't believe that is legal. There is no object support in nopython, so you can't instantiate and return a tuple object.

    Also in my function I would create the arrays I want to return by using: cuda.local.array()

    Unfortunately that isn't supported either. It is only legal to return an array which was passed as an argument to the function.