Let
import pyopencl as cl
import pyopencl.array as cl_array
import numpy
a = numpy.random.rand(50000).astype(numpy.float32)
mf = cl.mem_flags
What is the difference between
a_gpu = cl.Buffer(self.ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
and
a_gpu = cl_array.to_device(self.ctx, self.queue, a)
?
And what is the difference between
result = numpy.empty_like(a)
cl.enqueue_copy(self.queue, result, result_gpu)
and
result = result_gpu.get()
?
Buffers are CL's version of malloc
, while pyopencl.array.Array
is a workalike of numpy arrays on the compute device.
So for the second version of the first part of your question, you may write a_gpu + 2
to get a new arrays that has 2 added to each number in your array, whereas in the case of the Buffer
, PyOpenCL only sees a bag of bytes and cannot perform any such operation.
The second part of your question is the same in reverse: If you've got a PyOpenCL array, .get()
copies the data back and converts it into a (host-based) numpy array. Since numpy arrays are one of the more convenient ways to get contiguous memory in Python, the second variant with enqueue_copy
also ends up in a numpy array--but note that you could've copied this data into an array of any size (as long as it's big enough) and any type--the copy is performed as a bag of bytes, whereas .get()
makes sure you get the same size and type on the host.
Bonus fact: There is of course a Buffer underlying each PyOpenCL array. You can get it from the .data
attribute.