I wanted to get a feel for the Elementwise demo that comes with PyOpenCL and decided to try this out:
from __future__ import absolute_import
from __future__ import print_function
import pyopencl as cl
import pyopencl.array as cl_array
import numpy
from pyopencl.elementwise import ElementwiseKernel
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
n = 6
a_gpu = cl.array.to_device(queue,
                           numpy.arange(1, n, dtype=int))
update_a = ElementwiseKernel(ctx,
                             "int *a",
                             "a[i] = 2*a[i]",
                             "update_a")
print(a_gpu.get())
update_a(a_gpu)
print(a_gpu.get())
I expected this to print
[1 2 3 4 5]
[2 4 6 8 10]
but instead I get
[1 2 3 4 5]
[2 4 6 4 5]
Furthermore, when I try to store the "i" value into the array to see what's going on, I get some really weird values. They are all over the place and some are even negative.
I have been trying to make sense of this for a while now but can't. Can somebody please explain why this is happening? Thanks.
Related info: PyOpenCL Version: 2018.2.1, Python Version: 3.6.5, OS: macOS 10.14.1
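Your bug lies in the vague typing of the numpy array, which leads to mismatched element strides between the CPU side and the CL-device side.
Specifying dtype=int is ambiguous: NumPy resolves it to a platform-dependent integer type, which on your system is the 8-byte np.int64 (a C long). The kernel argument int *a, however, steps through the buffer 4 bytes at a time, so it only reaches the first half of the data.
The matching type on the CL-device side should be long *a_in for np.int64.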
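The mismatch can be reproduced on the host without any OpenCL device. The following is a minimal NumPy-only sketch, not what PyOpenCL does internally: it assumes the kernel runs once per array element (which the observed output suggests) and pins little-endian layouts with "<i8" and "<i4" so the result is reproducible.

```python
import numpy as np

# Five 8-byte integers, as dtype=int yields on most 64-bit macOS/Linux builds.
# "<i8" pins little-endian int64 so this demo behaves the same everywhere.
host = np.arange(1, 6, dtype="<i8")
print(host.itemsize)  # 8 bytes per element, 40 bytes in total

# A kernel declared with "int *a" sees those same 40 bytes as ten 4-byte ints.
as_int32 = host.view("<i4")
print(as_int32)  # [1 0 2 0 3 0 4 0 5 0]

# Assumption: one work-item per array element, i.e. n = 5.
# Each work-item doubles one 4-byte slot, so only the first five slots change.
n = host.size
as_int32[:n] *= 2

# Reading the buffer back as 8-byte integers reproduces the puzzling output.
print(host)  # [2 4 6 4 5]
```

The first three 8-byte values are doubled only because their low 4 bytes happen to land in slots 0, 2 and 4; the last two values are never touched.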
If you want to stick with 4-byte integers, specify dtype=np.int32 on the CPU side and int *a_in on the CL-device side.
Takeaway: Always specify your numpy array types explicitly, e.g., dtype=np.int64, and check for a precise match on the CL-device side.
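One way to make that check hard to forget is to derive the kernel argument string from the array's dtype instead of typing it by hand. The sketch below is only an illustration: kernel_arg and _CTYPE_FOR_DTYPE are hypothetical names invented here, and the table covers just four common types.

```python
import numpy as np

# Hypothetical lookup table for this sketch. OpenCL C fixes these widths
# (int is 32-bit, long is 64-bit) regardless of the host platform.
_CTYPE_FOR_DTYPE = {
    np.dtype(np.int32): "int",
    np.dtype(np.int64): "long",
    np.dtype(np.float32): "float",
    np.dtype(np.float64): "double",
}


def kernel_arg(array, name):
    """Return a pointer argument string whose C type matches array.dtype."""
    try:
        ctype = _CTYPE_FOR_DTYPE[array.dtype]
    except KeyError:
        # Fail loudly rather than guess a type and corrupt the buffer.
        raise TypeError(f"no OpenCL C type known for dtype {array.dtype}") from None
    return f"{ctype} *{name}"


print(kernel_arg(np.arange(1, 6, dtype=np.int32), "a"))  # int *a
print(kernel_arg(np.arange(1, 6, dtype=np.int64), "a"))  # long *a
```

The returned string can then be passed as the argument list of ElementwiseKernel in place of the hard-coded "int *a". PyOpenCL also appears to offer a dtype_to_ctype helper in pyopencl.tools for the same purpose; check the documentation for your version before relying on it.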