I'm new to parallel programming. I'm trying to parallelize the serial code below in SYCL, but when I run it I get incorrect results.
The serial code, the SYCL code, and an output screenshot are below. Please help me with this.
Thanks in advance.
//Serial code
for(int i = 0; i < N; i++)
    a[i] = pow(p+i, q-i);
//Parallel code
queue defaultqueue;
buffer<unsigned long long int,1> buf(a, range<1>(N));
defaultqueue.submit([&](handler &cgh){
    auto bufacc = buf.get_access<access::mode::read_write>(cgh);
    cgh.parallel_for<class single_dim>(range<1>(N), [=](nd_item<1> it){
        auto idx = it.get_global_linear_id();
        unsigned long long int x;
        x = pow(p+idx, q-idx);
        bufacc[idx] += x;
    });
});
Kernel calls in SYCL are non-blocking, i.e., the CPU continues executing after submitting the kernel without waiting for the kernel to finish.
This can lead to data inconsistency, especially in your case, since you access the data immediately after the kernel launch. The effect is more pronounced when the kernel does heavy, time-consuming computation.
So you could try calling defaultqueue.wait() after the kernel submission.
Hope this resolves your issue.