
How to handle a python list with PyCUDA?


I guess this is a rather easy question for an expert, yet I can't find any answer on the net. Given a simple case:

The problem:

listToProcess = []
for i in range(0, 10):
    listToProcess.append(i)

This list shall be transferred to the GPU for further processing. I would then go on with the usual CUDA procedure for the memory copy:

import sys
import pycuda.autoinit
import pycuda.driver as cuda

listToProcess_gpu = cuda.mem_alloc(sys.getsizeof(listToProcess))
cuda.memcpy_htod(listToProcess_gpu, listToProcess)

and afterwards call the kernel itself. However, lists don't support the buffer interface, so memcpy_htod() crashes. I tried different approaches too, but in the end they all run into the same problem.

The questions

  • How does one transfer a list with its content from a Python program to the GPU kernel?
  • How does one specify the data type of the list (i.e. a list of floats, or ints, or ...) for the kernel?

Solution

  • The only way to do this is to create an object which supports the buffer protocol from the list, and pass that new object to PyCUDA. In practice, that means creating a numpy array or a PyCUDA native gpuarray from the list and using that instead:

    import sys
    import pycuda.autoinit
    import pycuda.driver as cuda
    import numpy as np
    
    listToProcess = []
    for i in range(0, 10):
        listToProcess.append(i)
    
    l2p = np.array(listToProcess, dtype=np.int32)
    listToProcess_gpu = cuda.mem_alloc(l2p.nbytes)
    cuda.memcpy_htod(listToProcess_gpu, l2p)
    

    This implies that your list must be homogeneous with respect to type. A numpy array with a dtype of object won't work.
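
    To illustrate the second question: the dtype argument passed to np.array is what fixes the element type the kernel will see, and the array's nbytes attribute gives the exact payload size to allocate. Note that sys.getsizeof on the original list (as in the question) measures the Python object overhead, not the element data, so it is the wrong size for mem_alloc. A small sketch, using only numpy:

    ```python
    import numpy as np

    listToProcess = list(range(10))

    # The dtype determines both the element type seen by the kernel
    # and the per-element size of the device buffer.
    as_ints = np.array(listToProcess, dtype=np.int32)      # 4 bytes per element
    as_floats = np.array(listToProcess, dtype=np.float32)  # 4 bytes per element
    as_doubles = np.array(listToProcess, dtype=np.float64) # 8 bytes per element

    # nbytes is the exact number of bytes to pass to cuda.mem_alloc().
    print(as_ints.nbytes)     # 40
    print(as_doubles.nbytes)  # 80
    ```

    Whichever dtype you pick here must match the parameter type declared in your kernel source (e.g. int* for np.int32, float* for np.float32).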

    You could, of course, put on a hair shirt and roll your own object with buffer protocol support using ctypes, but that would be reinventing the wheel given what PyCUDA supports natively.
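
    For completeness, the gpuarray route mentioned above folds the allocation and the host-to-device copy into a single call. A minimal sketch (it assumes a working CUDA device, so it is shown untested here):

    ```python
    import numpy as np
    import pycuda.autoinit
    import pycuda.gpuarray as gpuarray

    listToProcess = list(range(10))

    # to_gpu allocates device memory and copies the data in one step;
    # the dtype of the intermediate numpy array is what the kernel sees.
    l2p_gpu = gpuarray.to_gpu(np.array(listToProcess, dtype=np.int32))

    # get() copies the data back to a host numpy array.
    result = l2p_gpu.get()
    ```

    The resulting gpuarray can be passed directly as a kernel argument, which saves the explicit mem_alloc/memcpy_htod pair from the answer above.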