NumPy array of arrays to PyOpenCL array of vecs


I have a NumPy array which contains arrays:

import numpy as np
import pyopencl as cl
someArray = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

Now, I'd like to convert this array to an OpenCL array of vec4s in order to do something with it. For example:

context = cl.create_some_context()
queue = cl.CommandQueue()
program = cl.Program("""
    __kernel void multiplyByTwo(__global const float32* someArrayAsOpenCLType, __global float32* result) {
        gid = get_global_id(0);
        vector = someArrayAsOpenCLType[gid];
        result[gid] = vector * 2;
    }
""").build()

someArrayAsOpenCLType = # something with someArray
result = # some other thing
program.multiplyByTwo(queue, someArray.shape, None, someArrayAsOpenCLType, result)

What do I do to convert someArray to someArrayAsOpenCLType?


Solution

  • The data in someArray is stored in host memory and has to be copied into a device buffer (someArrayAsOpenCLType). No explicit conversion to a vector type is needed: a C-contiguous float32 array of shape (N, 4) has the same byte layout as N float4 values, so the kernel can simply declare the buffer as float4*.

    The kernel executes on the device and stores its results in a pre-allocated device buffer (resultAsOpenCLType).

    After execution, the program can copy the results from the device buffer back to host memory, e.g. with cl.enqueue_copy(queue, result, resultAsOpenCLType).

    A simple example follows (there may be other ways to do this):

    import numpy as np
    import pyopencl as cl
    
    # Context
    ctx = cl.create_some_context()
    # Create queue
    queue = cl.CommandQueue(ctx)
    
    someArray = np.array([
        [1, 2, 3, 4],
        [5, 6, 7, 8]
    ]).astype(np.float32)
    
    print("")
    print("Input:")
    print(someArray)
    print("------------------------------------")
    
    # Get mem flags
    mf = cl.mem_flags
    
    # Create a read-only buffer on device and copy 'someArray' from host to device
    someArrayAsOpenCLType = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=someArray)
    
    # Create a write-only buffer to get the result from device
    resultAsOpenCLType = cl.Buffer(ctx, mf.WRITE_ONLY, someArray.nbytes)
    
    # Build the program that contains the kernel
    program = cl.Program(ctx, """
    __kernel void multiplyByTwo(__global const float4 *someArrayAsOpenCLType, __global float4 *resultAsOpenCLType) {
            int gid = get_global_id(0);
    
            float4 vector = someArrayAsOpenCLType[gid];
            resultAsOpenCLType[gid] = vector * 2.0f;
    }
    """).build()
    
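    # Execute with one work-item per float4, i.e. one per row of someArray
    # (passing someArray.shape would launch a 2x4 grid and process each row four times)
    program.multiplyByTwo(queue, (someArray.shape[0],), None, someArrayAsOpenCLType, resultAsOpenCLType)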
    
    # Allocate a host array (same shape and dtype as the input) for the result
    result = np.empty_like(someArray)
    
    # Copy the results from device to host
    cl.enqueue_copy(queue, result, resultAsOpenCLType)
    
    print("------------------------------------")
    print("Output")
    # Show the result
    print(result)
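
The reason the kernel above can read the buffer as float4 is purely a matter of memory layout. The following NumPy-only sketch checks that layout on the host without needing an OpenCL device. The structured dtype named float4 is a hypothetical stand-in defined here for illustration; it is not PyOpenCL's own type (recent PyOpenCL versions appear to ship comparable vector dtypes in pyopencl.cltypes, but that is not used below).

```python
import numpy as np

# Same data as in the example above.
someArray = np.array([
    [1, 2, 3, 4],
    [5, 6, 7, 8]
]).astype(np.float32)

# Hypothetical host-side stand-in for OpenCL's float4: four packed float32s.
float4 = np.dtype([("x", np.float32), ("y", np.float32),
                   ("z", np.float32), ("w", np.float32)])

# The kernel's float4* view only makes sense if the rows are packed
# back to back as 4 x float32.
assert someArray.dtype == np.float32
assert someArray.flags["C_CONTIGUOUS"]
assert someArray.shape[1] == 4

# Reinterpret the same bytes (no copy) as one float4 per row.
asVec4 = someArray.view(float4).reshape(-1)

print(asVec4.shape)      # (2,)  one element per row
print(asVec4.itemsize)   # 16    bytes per element
print(asVec4[1]["w"])    # 8.0   last component of the second row
```

The number of elements in asVec4 (here 2) is also the number of work-items the kernel needs, since each work-item handles one float4.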
    

    Output of a run, choosing platform 0 at the prompt:

    Choose platform:
    [0] <pyopencl.Platform 'Intel(R) OpenCL' at 0x858ea0>
    [1] <pyopencl.Platform 'Experimental OpenCL 2.0 CPU Only Platform' at 0x872880>
    [2] <pyopencl.Platform 'NVIDIA CUDA' at 0x894a80>
    Choice [0]:
    Set the environment variable PYOPENCL_CTX='' to avoid being asked again.
    
    Input:
    [[ 1.  2.  3.  4.]
     [ 5.  6.  7.  8.]]
    ------------------------------------
    C:\Python27\lib\site-packages\pyopencl\__init__.py:59: CompilerWarning: Built kernel retrieved from cache. Original from-source build had warnings:
    Build on <pyopencl.Device 'Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz' on 'Intel(R) OpenCL' at 0x86ca30> succeeded, but said:
    
    Compilation started
    Compilation done
    Linking started
    Linking done
    Device build started
    Device build done
    Kernel <multiplyByTwo> was not vectorized
    Done.
      warn(text, CompilerWarning)
    C:\Python27\lib\site-packages\pyopencl\__init__.py:59: CompilerWarning: From-binary build succeeded, but resulted in non-empty logs:
    Build on <pyopencl.Device 'Intel(R) Core(TM) i7-5820K CPU @ 3.30GHz' on 'Intel(R) OpenCL' at 0x86ca30> succeeded, but said:
    
    Device build started
    Device build done
    Reload Program Binary Object.
      warn(text, CompilerWarning)
    ------------------------------------
    Output
    [[  2.   4.   6.   8.]
     [ 10.  12.  14.  16.]]
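
The platform prompt in the output above suggests setting the PYOPENCL_CTX environment variable to avoid being asked again. A minimal sketch of doing that from within the script is shown here; the value "0" is an assumption that matches the choice made in the run above, and it has to be set before cl.create_some_context() is called.

```python
import os

# Assumption: "0" picks the first platform from the list shown in the prompt.
# If that platform has several devices, a "platform:device" form such as
# "0:0" may be needed to avoid a second prompt.
# setdefault keeps any value already exported in the shell.
os.environ.setdefault("PYOPENCL_CTX", "0")

# Call cl.create_some_context() only after this point.
print(os.environ["PYOPENCL_CTX"])
```

Setting the variable in the shell instead works the same way and keeps the choice out of the source code.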
    

    Some tutorials about OpenCL on Intel's site:

    Intel - OpenCL™ Tutorials