OpenCL kernel to calculate pi does not return the correct value

Good day,

I have an OpenCL kernel that uses the Leibniz formula to calculate pi. My issue is that the value I get back isn't pi; it is exactly 4.

__kernel void calculatePi(int numIterations, __global float *outputPi,
                          __local float* local_result, int numWorkers)
{
    __private const uint gid = get_global_id(0);
    __private const uint lid = get_local_id(0);
    __private const uint offset = numIterations*gid*2;
    __private float sum = 0.0f;

    // Have the first worker initialize local_result
    if (gid == 0) {
        for (int i = 0; i < numWorkers; i++) {
            local_result[i] = 0.0f;
        }
    }

    // Have all workers wait until this is completed
    barrier(CLK_LOCAL_MEM_FENCE);

    // Have each worker calculate their portion of pi
    // This is a private value
    for (int i = 0; i < numIterations; i++) {
        if (i % 2 == 0) {
            sum += 1 / (1 + 2*i + offset);
        } else {
            sum -= 1 / (1 + 2*i + offset);
        }
    }

    // Have each worker move their value to the appropriate
    // local_result slot so that the first worker can see it
    // when reducing next
    local_result[gid] = sum;

    // Make sure all workers complete this task before continuing
    barrier(CLK_LOCAL_MEM_FENCE);

    // Have the first worker add up all of the other workers' values
    // to get the final value
    if (lid == 0) {
        outputPi[0] = 0;
        for (int i = 0; i < numWorkers; i++) {
            outputPi[0] += local_result[i];
        }

        outputPi[0] *= 4;
    }
}

I've routed each of my inputs to the output to verify that they are what I expect. numIterations is 16 and numWorkers is also 16.

When sum is calculated for the first worker, I would expect it to be 1 - 1/3 + 1/5 - 1/7 + 1/9 - 1/11 + 1/13 - 1/15 + 1/17 - 1/19 + 1/21 - 1/23 + 1/25 - 1/27 + 1/29 - 1/31

Using this calculator for the first 16 terms, I expect the result to be around 3.2.
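For reference, the expected first-worker value can be checked on the host with a few lines of plain C. This is only a sketch: `leibniz_pi` is a hypothetical helper name, not part of the kernel, and it uses float division throughout.

```c
/* Hypothetical helper: 4 times the sum of the first `terms` terms of the
   Leibniz series 1 - 1/3 + 1/5 - 1/7 + ... */
static float leibniz_pi(int terms)
{
    float sum = 0.0f;
    for (int i = 0; i < terms; i++) {
        /* Denominators are the odd numbers 1, 3, 5, ... */
        const float term = 1.0f / (float)(2 * i + 1);
        /* Even-indexed terms are added, odd-indexed terms are subtracted. */
        sum += (i % 2 == 0) ? term : -term;
    }
    return 4.0f * sum;
}
```

Calling `leibniz_pi(16)` gives roughly 3.08, which is in the same ballpark as the calculator figure and clearly not 4, while `leibniz_pi(1)` gives exactly 4.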

If I change the last part of the kernel as follows, so that I can inspect a single worker's calculated value of "sum":

    // Have the first worker report its own value
    if (lid == 0) {
        outputPi[0] = sum * 4;
    }

Then the value returned for the first worker is 4 instead of the expected 3.2.

If I change the condition to any lid other than 0, every other worker reports its sum as 0. So my question is: why are these the calculated values? Am I doing something wrong with my sum variable? As I understand it, sum should be private to each worker, and each worker should run its for loop sequentially while the workers themselves run in parallel.

Here's a link to my GitHub repository, which has the kernel and main code uploaded.



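  • You are performing integer division in your code. Both operands of 1 / (1 + 2*i + offset) are integers, so the result is 1 for the very first term (1 / 1) and 0 for every other term. That is why the first worker reports 4 and all the others report 0. Make the numerator a float:

    if (i % 2 == 0)
       sum += 1.0f / (1 + 2*i + offset); // note the 1.0f: float division
    else
       sum -= 1.0f / (1 + 2*i + offset);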
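The difference can be reproduced without a GPU. The following is a rough host-side sketch in plain C, not the kernel itself: `worker_sum` and `simulate_pi` are hypothetical names, each work-item is replaced by one iteration of an ordinary loop, and the `use_float_division` flag is an assumption added only to switch between the two variants.

```c
/* Hypothetical stand-in for the per-worker loop of calculatePi.
   use_float_division == 0 reproduces the integer division from the question,
   any other value uses the float division suggested in the answer. */
static float worker_sum(unsigned gid, int numIterations, int use_float_division)
{
    const unsigned offset = (unsigned)numIterations * gid * 2u;
    float sum = 0.0f;
    for (int i = 0; i < numIterations; i++) {
        const unsigned denom = 1u + 2u * (unsigned)i + offset;
        /* Integer 1 / denom truncates to 0 for every denom greater than 1. */
        const float term = use_float_division ? 1.0f / (float)denom
                                              : (float)(1u / denom);
        if (i % 2 == 0)
            sum += term;
        else
            sum -= term;
    }
    return sum;
}

/* Mirrors the final reduction: add up every worker's sum and multiply by 4. */
static float simulate_pi(int numWorkers, int numIterations, int use_float_division)
{
    float total = 0.0f;
    for (int gid = 0; gid < numWorkers; gid++)
        total += worker_sum((unsigned)gid, numIterations, use_float_division);
    return total * 4.0f;
}
```

With 16 workers and 16 iterations, `simulate_pi(16, 16, 0)` returns exactly 4, matching the behaviour described in the question, while `simulate_pi(16, 16, 1)` returns about 3.1377, which is what 256 terms of the Leibniz series should give.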