How to correctly assign double8 type

I am trying to assign a double8 value, ultimately for some AVX2 parallelisation using pyopencl. I am writing code to efficiently compute the dot product of two vectors, va and vb, and return the result in vc.

Code is below:

import numpy as np
import pyopencl as cl

# create context and command queue
ctx = cl.create_some_context()
queue = cl.CommandQueue(ctx)
mf = cl.mem_flags

# define vectors to dot product 
va=np.array([1, 2, 3, 4, 5, 6, 7, 8],dtype=np.float32)
vb=np.array([1, 2, 3, 4, 5, 6, 7, 8],dtype=np.float32)

# create memory buffers for input vectors and output buffer
va_buf=cl.Buffer(ctx,mf.READ_ONLY|mf.COPY_HOST_PTR,hostbuf=va)
vb_buf=cl.Buffer(ctx,mf.READ_ONLY|mf.COPY_HOST_PTR,hostbuf=vb)
vc_buf=cl.Buffer(ctx,mf.WRITE_ONLY,vb.nbytes)

# define my kernel / C function that will perform dot product 
kernel="""
__kernel void adder(const __global float* va,
                const __global float* vb,
                __global float* vc
)
{
double8 v1 = (va[0],va[1],va[2],va[3],va[4],va[5],va[6],va[7]);
double8 v2 = (vb[0],vb[1],vb[2],vb[3],vb[4],vb[5],vb[6],vb[7]);

vc = v1.s0*v2.s0
    +v1.s1*v2.s1
    +v1.s2*v2.s2
    +v1.s3*v2.s3
    +v1.s4*v2.s4
    +v1.s5*v2.s5
    +v1.s6*v2.s6
    +v1.s7*v2.s7;
}

"""

# run the kernel 
adder=cl.Program(ctx,kernel).build().adder
event=adder(queue,va.shape,None,va_buf,vb_buf,vc_buf)
event.wait()

# create empty array and copy output buffer to it
vd = np.zeros(va.shape, dtype=np.float32)  # dtype must match the float32 buffer
cl.enqueue_copy(queue,vd,vc_buf)

My error is:

RuntimeError: clBuildProgram failed: BUILD_PROGRAM_FAILURE - clBuildProgram failed:         
BUILD_PROGRAM_FAILURE - clBuildProgram failed: BUILD_PROGRAM_FAILURE

Build on <pyopencl.Device 'pthread-Intel(R) Xeon(R) CPU E5-2673 v4 @ 2.30GHz' on 'Portable Computing Language' at 0x5594de2e3b60>:

error: /home/nbuser/.cache/pocl/kcache/tempfile-27-53-7c-c7-a0.cl:10:4: assigning to '__global float *' from incompatible type 'double'

Solution

I don't really know anything about pyopencl, but I assume the kernels are exactly like regular OpenCL kernels. The build error isn't caused by the assignment of a double8 type, but by the assignment to vc. You declared vc as a __global float*, which is a pointer type, so you cannot assign a double value to it directly. See how you treated va and vb as arrays and accessed their elements with [index]? The same is true for vc. Since vc is only intended to store a single value, you can write to its first element:

vc[0] = ...

or use a pointer dereference:

*vc = ...

So what you should do is this instead:

*vc = v1.s0*v2.s0
    +v1.s1*v2.s1
    +v1.s2*v2.s2
    +v1.s3*v2.s3
    +v1.s4*v2.s4
    +v1.s5*v2.s5
    +v1.s6*v2.s6
    +v1.s7*v2.s7;

You may want to read up on C/C++ pointers. OpenCL pointers are basically the same, with some differences such as address space qualifiers like __global.
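For reference, here is a minimal sketch of how the whole thing might look with the vc fix applied. It has not been tested against the asker's setup and rests on a few assumptions: the device supports double precision (the fp64 pragma is only needed on older OpenCL versions), one work item is enough because only a single value is written, and the names KERNEL_SRC, reference_dot and dot8_on_device are made up for this example. It also writes the vectors as (double8)(...) literals. As far as I can tell, a bare (a, b, ...) on the right-hand side is parsed as the C comma operator, so it compiles but keeps only the last value, which is worth double-checking once the kernel builds.

```python
# OpenCL C source for the corrected kernel (illustrative sketch).
KERNEL_SRC = """
#pragma OPENCL EXTENSION cl_khr_fp64 : enable

__kernel void adder(__global const float* va,
                    __global const float* vb,
                    __global float* vc)
{
    // Vector literal: the type goes in front of the parenthesised list.
    double8 v1 = (double8)(va[0], va[1], va[2], va[3],
                           va[4], va[5], va[6], va[7]);
    double8 v2 = (double8)(vb[0], vb[1], vb[2], vb[3],
                           vb[4], vb[5], vb[6], vb[7]);

    // Component-wise product, then a horizontal sum of the 8 lanes.
    double8 p = v1 * v2;

    // vc is a pointer, so write through it instead of assigning to it.
    vc[0] = (float)(p.s0 + p.s1 + p.s2 + p.s3 + p.s4 + p.s5 + p.s6 + p.s7);
}
"""


def reference_dot(a, b):
    """Plain-Python dot product, used to check the device result."""
    return sum(x * y for x, y in zip(a, b))


def dot8_on_device(a, b):
    """Run the kernel on two length-8 vectors and return the dot product."""
    # Third-party imports are kept local so reference_dot can be used on a
    # machine without an OpenCL runtime.
    import numpy as np
    import pyopencl as cl

    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)

    ctx = cl.create_some_context()
    queue = cl.CommandQueue(ctx)
    mf = cl.mem_flags

    a_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=a)
    b_buf = cl.Buffer(ctx, mf.READ_ONLY | mf.COPY_HOST_PTR, hostbuf=b)

    # The kernel writes exactly one float32, so the output holds one element.
    out = np.zeros(1, dtype=np.float32)
    out_buf = cl.Buffer(ctx, mf.WRITE_ONLY, out.nbytes)

    program = cl.Program(ctx, KERNEL_SRC).build()
    # Global size (1,): a single work item computes the whole sum.
    program.adder(queue, (1,), None, a_buf, b_buf, out_buf)

    # Device-to-host copy; blocks until the data has arrived by default.
    cl.enqueue_copy(queue, out, out_buf)
    return float(out[0])
```

With the vectors from the question, dot8_on_device(range(1, 9), range(1, 9)) should agree with reference_dot on the same inputs, assuming the kernel builds on your platform.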