parallel-processing sycl dpc++ intel-oneapi

Incorrect results when running SYCL code while trying to parallelize a loop


I'm new to parallel programming. I'm trying to parallelize the serial code below using SYCL, but when I run it, I get incorrect results.

Please find the serial code, SYCL code and output screenshot below. Please help me with this.

Thanks in advance.

//Serial code

for(int i = 0; i < N; i++)
        a[i]=pow(p+i,q-i);

//Parallel code

queue defaultqueue;
buffer<unsigned long long int,1> buf(a, range<1>(N));
defaultqueue.submit([&](handler &cgh){
    auto bufacc = buf.get_access<access::mode::read_write>(cgh);
    cgh.parallel_for<class single_dim>(range<1>(N), [=](nd_item<1> it){
        auto idx = it.get_global_linear_id();
        unsigned long long int x;
        x=pow(p+idx,q-idx);
        bufacc[idx] += x;
    });
});

Output of parallel code


Solution

  • Kernel submissions in SYCL are non-blocking, i.e., the host (CPU) continues executing after submitting the kernel without waiting for it to finish.

    This can lead to data inconsistency, especially in your case, since you are accessing the data immediately after the kernel launch. The effect is more pronounced when the kernel performs heavy, time-consuming computations.

    So, try calling defaultqueue.wait() after the kernel submission.

    Hope this resolves your issue.