I am trying to assign to a double8 vector type in a pyopencl kernel, ultimately for some AVX2 parallelisation. The code is meant to compute the dot product of two vectors, va and vb, efficiently and return the result in vc.
The code is below:
import numpy as np
import pyopencl as cl
# create context and command queue
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags
# define vectors to dot product
va=np.array([1, 2, 3, 4, 5, 6, 7, 8],dtype=np.float32)
vb=np.array([1, 2, 3, 4, 5, 6, 7, 8],dtype=np.float32)
# create memory buffers for input vectors and output buffer
va_buf=cl.Buffer(ctx,mf.READ_ONLY|mf.COPY_HOST_PTR,hostbuf=va)
vb_buf=cl.Buffer(ctx,mf.READ_ONLY|mf.COPY_HOST_PTR,hostbuf=vb)
vc_buf=cl.Buffer(ctx,mf.WRITE_ONLY,vb.nbytes)
# define my kernel / C function that will perform dot product
kernel="""
__kernel void adder(const __global float* va,
const __global float* vb,
__global float* vc
)
{
double8 v1 = (va[0],va[1],va[2],va[3],va[4],va[5],va[6],va[7]);
double8 v2 = (vb[0],vb[1],vb[2],vb[3],vb[4],vb[5],vb[6],vb[7]);
vc = v1.s0*v2.s0
+v1.s1*v2.s1
+v1.s2*v2.s2
+v1.s3*v2.s3
+v1.s4*v2.s4
+v1.s5*v2.s5
+v1.s6*v2.s6
+v1.s7*v2.s7;
}
"""
# run the kernel
adder=cl.Program(ctx,kernel).build().adder
event=adder(queue,va.shape,None,va_buf,vb_buf,vc_buf)
event.wait()
# create empty array and copy output buffer to it
vd = np.zeros(va.shape)
cl.enqueue_copy(queue,vd,vc_buf)
My error is:
RuntimeError: clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed:
BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE
Build on <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz' on 'Portable Computing Language' at 0x5594de2e3b60>:
error: /home/nbuser/.cache/pocl/kcache/tempfile-27-53-7c-c7-a0.cl:10:4: assigning to '__global float *' from incompatible type 'double'
I don't know pyopencl well, but I assume its kernels are exactly like regular OpenCL kernels. The build error isn't about the double8 assignments; it is about the assignment to vc. You declared vc as a __global float*, which is a pointer type, so a double value cannot be assigned to it directly. See how you treated va and vb as arrays and accessed their elements with [index]? The same applies to vc. Since vc is only intended to store a single value, you can index its first element:
vc[0] = ...
or dereference the pointer:
*vc = ...
So the assignment should look like this instead:
*vc = v1.s0*v2.s0
+v1.s1*v2.s1
+v1.s2*v2.s2
+v1.s3*v2.s3
+v1.s4*v2.s4
+v1.s5*v2.s5
+v1.s6*v2.s6
+v1.s7*v2.s7;
You may want to read up on C/C++ pointers. OpenCL pointers behave basically the same way, with some differences such as address space qualifiers like __global.
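For completeness, here is a minimal sketch of how the whole program might look with the result written through the pointer. It makes several assumptions that are not in the original post: it uses float8 instead of double8 so the kernel matches the float32 host arrays and does not rely on double-precision support, it loads the inputs with vload8, it launches a single work-item, and the names KERNEL_SRC and dot8 are made up for illustration. One more thing to watch for: an OpenCL C vector literal needs the type in front, as in (double8)(va[0], ..., va[7]). A bare parenthesised list is a comma expression that yields only its last value.

```python
import numpy as np

# OpenCL C source. float8 is an assumption chosen to match the float32 host
# data; the question used double8, which needs double-precision support.
KERNEL_SRC = """
__kernel void dot8(__global const float *va,
                   __global const float *vb,
                   __global float *vc)
{
    float8 v1 = vload8(0, va);   /* reads va[0] .. va[7] */
    float8 v2 = vload8(0, vb);   /* reads vb[0] .. vb[7] */
    float8 p  = v1 * v2;         /* component-wise product */

    /* vc is a pointer, so store the scalar into its first element */
    vc[0] = p.s0 + p.s1 + p.s2 + p.s3 + p.s4 + p.s5 + p.s6 + p.s7;
}
"""


def dot8(va, vb):
    """Dot product of two 8-element vectors on an OpenCL device (sketch)."""
    # Imported here so the kernel source can be inspected on a machine
    # without an OpenCL runtime installed.
    import pyopencl as cl

    va = np.ascontiguousarray(va, dtype=np.float32)
    vb = np.ascontiguousarray(vb, dtype=np.float32)
    if va.shape != (8,) or vb.shape != (8,):
        raise ValueError("this sketch only handles 8-element vectors")

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    va_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=va)
    vb_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=vb)

    # One float32 result: host array and device buffer have the same size.
    vc = np.zeros(1, dtype=np.float32)
    vc_buf = cl.Buffer(ctx, mf.WRITE_ONLY, vc.nbytes)

    program = cl.Program(ctx, KERNEL_SRC).build()
    # A single work-item is enough because the kernel does the whole sum.
    program.dot8(queue, (1,), None, va_buf, vb_buf, vc_buf)

    cl.enqueue_copy(queue, vc, vc_buf)
    return float(vc[0])
```

Calling dot8 with the two arrays from the question should give 204.0 on a machine with a working OpenCL runtime. The question's host code copies the result into np.zeros(va.shape), which is a float64 array of eight elements, so it would not line up with a float32 output buffer. The sketch sidesteps that by reading back into a one-element float32 array.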