
PyOpenCL kernel not being applied to entire array


I wanted to get a feel for the Elementwise demo that comes with PyOpenCL and decided to try this out:

from __future__ import absolute_import
from __future__ import print_function
import pyopencl as cl
import pyopencl.array as cl_array
import numpy
from pyopencl.elementwise import ElementwiseKernel

ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)

n = 6

a_gpu = cl.array.to_device(queue,
                           numpy.arange(1, n, dtype=int))

update_a = ElementwiseKernel(ctx,
                             "int *a",
                             "a[i] = 2*a[i]",
                             "update_a")

print(a_gpu.get())
update_a(a_gpu)
print(a_gpu.get())

I expected this to print

[1 2 3 4 5]
[2 4 6 8 10]

but instead I get

[1 2 3 4 5]
[2 4 6 4 5]

Furthermore, when I try to store the "i" value into the array to see what's going on, I get some really weird values. They are all over the place and some are even negative.

I have been trying to make sense of this for a while now but can't. Can somebody please explain why this is happening? Thanks.

Related info: PyOpenCL Version: 2018.2.1, Python Version: 3.6.5, OS: macOS 10.14.1


Solution

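  • The bug is a type mismatch: the elements of the numpy array on the host are a different size from the elements the kernel expects on the CL device.

    dtype=int is platform-dependent. On macOS (and 64-bit Linux) it resolves to 8-byte numpy.int64, while int in OpenCL C is always 4 bytes. The kernel therefore walks the buffer in 4-byte steps: its 5 work items cover only the first 20 of the 40 bytes, doubling the low halves of the first three elements and never reaching the last two. That is why you see [2 4 6 4 5]. The matching type on the CL-device side for numpy.int64 is long *a.

    If you want to stick with 4-byte integers, specify dtype=numpy.int32 on the CPU side and keep int *a on the CL-device side.

    Takeaway: always give your numpy arrays an explicit fixed-width type, e.g., dtype=numpy.int64, and check that the kernel argument type on the CL-device side matches it exactly.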
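
The size mismatch can be reproduced without an OpenCL device. The following is a NumPy-only sketch, not PyOpenCL code: it assumes a little-endian memory layout (pinned here with the "<i8" and "<i4" dtype strings) and assumes the kernel runs one work item per array element, so 5 work items for this array. The names host, as_int32, and fixed are illustrative only.

```python
import numpy

n = 6

# Host array as the question builds it on macOS: five 8-byte integers.
# "<i8" pins little-endian int64 so the result does not depend on the machine.
host = numpy.arange(1, n, dtype="<i8")

# What an "int *a" kernel sees: the same 40 bytes read as ten 4-byte integers.
as_int32 = host.view("<i4")

# Assumption: one work item per array element (5), each doubling one 4-byte slot.
for i in range(host.size):
    as_int32[i] = 2 * as_int32[i]

print(host)  # [2 4 6 4 5]

# With matching 4-byte elements on both sides, every element is reached.
fixed = numpy.arange(1, n, dtype="<i4")
for i in range(fixed.size):
    fixed[i] = 2 * fixed[i]

print(fixed)  # [ 2  4  6  8 10]
```

The first result matches the unexpected output in the question. Slots 0, 2, and 4 hold the low halves of the elements 1, 2, and 3, which get doubled; slots 1 and 3 hold their zero high halves, which stay zero; the elements 4 and 5 lie beyond the fifth slot and are never touched.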