Pyopencl: difference between to_device and Buffer

Let

import pyopencl as cl
import pyopencl.array as cl_array
import numpy
a = numpy.random.rand(50000).astype(numpy.float32)
mf = cl.mem_flags

What is the difference between

a_gpu = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)

and

a_gpu = cl_array.to_device(self.ctx, self.queue, a)

And what is the difference between

result =  numpy.empty_like(a)
cl.enqueue_copy(self.queue, result, result_gpu)

and

result = result_gpu.get()

Solution

Buffers are CL's version of malloc, while pyopencl.array.Array is a workalike of numpy arrays on the compute device.

So for the second version of the first part of your question, you may write a_gpu + 2 to get a new arrays that has 2 added to each number in your array, whereas in the case of the Buffer, PyOpenCL only sees a bag of bytes and cannot perform any such operation.

The second part of your question is the same in reverse: If you've got a PyOpenCL array, .get() copies the data back and converts it into a (host-based) numpy array. Since numpy arrays are one of the more convenient ways to get contiguous memory in Python, the second variant with enqueue_copy also ends up in a numpy array--but note that you could've copied this data into an array of any size (as long as it's big enough) and any type--the copy is performed as a bag of bytes, whereas .get() makes sure you get the same size and type on the host.

Bonus fact: There is of course a Buffer underlying each PyOpenCL array. You can get it from the .data attribute.