
I get wrong output when adding arrays with OpenCL


I'm trying to sum two arrays with PyOpenCL, but I get strange numbers in the output.

Code:

import numpy
import pyopencl as cl

def sum_arrays_with_cl(array1, array2):
    """
    Sums 2 arrays with GPU.
    """
    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags
    a_array = numpy.array(array1)
    b_array = numpy.array(array2)
    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a_array)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b_array)
    dest_buf = cl.Buffer(ctx, mf.WRITE_ONLY, b_array.nbytes)
    prg = cl.Program(ctx, """
    __kernel void sum(__global const float *a,
                      __global const float *b,
                      __global float *res_g)
    {
      int gid = get_global_id(0);
      res_g[gid] = a[gid] + b[gid];
    }
    """).build()
    prg.sum(queue, a_array.shape, None, a_buf, b_buf, dest_buf)
    a_plus_b = numpy.empty_like(a_array)
    cl.enqueue_copy(queue, a_plus_b, dest_buf).wait()
    return list(a_plus_b)

a = [1 for dummy in range(10)]
b = [i for i in range(10)]

print sum_arrays_with_cl(a, b)

Output:

[0, 0, 0, 0, 0, 5, 6, 7, 8, 9]

What am I doing wrong?


Solution

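  • You need to be explicit about the dtype of your arrays; otherwise the data created on the host won't match what the device expects. numpy.array() on a list of Python integers produces an integer array (usually 64-bit), but the kernel reads its arguments as 32-bit floats, so it reinterprets the raw bytes, and with 64-bit integers the 10 work-items cover only half of each buffer. Since your kernel expects 32-bit floating-point data, create your arrays like this: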

    a_array = numpy.array(array1).astype(numpy.float32)
    b_array = numpy.array(array2).astype(numpy.float32)
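
The mismatch can be seen on the host alone, without an OpenCL device. The following is a minimal sketch, not part of the original answer: it assumes the integer arrays were 64-bit (forced here with the explicit little-endian dtype '<i8', since the default integer dtype is platform dependent), and the names a_int, a_seen, a_f32 and b_f32 are made up for illustration. It uses numpy's view() to reinterpret the same bytes as float32, which approximates what the kernel does with the untyped buffer.

```python
import numpy

# Same inputs as in the question.
a = [1 for _ in range(10)]
b = list(range(10))

# Assumption: the host arrays ended up as 64-bit integers.
a_int = numpy.array(a, dtype='<i8')
print(a_int.dtype, a_int.nbytes)    # 10 elements * 8 bytes each

# Reinterpret the same bytes as 32-bit floats, as the kernel would.
a_seen = a_int.view('<f4')
print(a_seen.size)                  # twice as many float32 slots as elements
print(a_seen[:10])                  # the part that 10 work-items would read

# With the fix from the answer, the host layout matches the kernel signature.
a_f32 = numpy.array(a).astype(numpy.float32)
b_f32 = numpy.array(b).astype(numpy.float32)
print(a_f32.dtype, a_f32.nbytes)    # 10 elements * 4 bytes each
print((a_f32 + b_f32).tolist())     # host-side reference for the expected sum
```

The first ten reinterpreted values are zeros and denormal-sized numbers rather than ones, and they span only the first five integers of the input. Once both arrays are float32, numpy.empty_like(a_array) in the original function also yields a float32 result array, so the copy back from dest_buf is read with the right type as well.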