Search code examples
openmp

Array overwriting(?) in OpenMP


As an introduction to OpenMP, I'm trying to print a list of prime numbers.

// compile: gcc primes.c -fopenmp -o primes
#include <stdio.h>  //printf()
#include <stdlib.h> //atoi() and malloc()

//#define ARR_MAX 1000

int isPrime(int n) {
    if (n < 2)
        return 0;
    for (int i = 2; i < n; i++) {
        if (n % i == 0)
            return 0;
    }
    return 1;
}

int main(int argc, char *argv[]) {
    int n;
    if (argc < 2 || !(n = atoi(argv[1]))) // || n >= ARR_MAX)
        return 0;

    // int arr[ARR_MAX];
    int *arr = (int *)malloc(sizeof(int) * (n + 1));
    int counter = 0;

    #pragma omp parallel for shared(arr)
    for (int i = 0; i <= n; i++) {
        if (isPrime(i)) {
            arr[counter] = i;
            printf("%2d  ", arr[counter]);
            counter++;
        }
    }
    printf("\n");

    for (int i = 0; i < counter; i++) {
        printf("%2d  ", arr[i]);
    }
    printf("\n");
    free(arr);
}

I print the results twice.

  • When they're first calculated and written, everything looks good.
  • Later on, when I try to print what was saved, things have gone wrong somehow.

Here's the same example, run 6 times:

$ ./primes 55
 2   3   5   7  11  13  17  19  23  43  47  53  29  31  37  41
43   3   5   7  11  13   1  19  23   0  47  53  33605084  31  37  41
$ ./primes 55
29  31  37  41   2   3   5   7  11  13  43  47  53  17  19  23
43  31  37  41  43   3   5   7  11  13   9  47  53   0  19  23
$ ./primes 55
 2   3   5   7  11  13  29  31  37  41  17  19  23  43  47  53
43   3   5   7  11  13   1  31  37  41   9  19  23   0  47  53
$ ./primes 55
29  31  37  41  43  47  53   2   3   5   7  11  13  17  19  23
43  31  37  41  43  47  53   0   3   5   7  11  13   0  19  23
$ ./primes 55
43  47  53  17  19  23   2   3   5   7  11  13  29  31  37  41
43  47  53   0  19  23   1   3   5   7  11  13  33605084  31  37  41
$ ./primes 55
29  31  37  41   2   3   5   7  11  13  43  47  53  17  19  23
43  31  37  41  43   3   5   7  11  13   9  47  53   0  19  23

I'm currently using malloced array, just to avoid having a max. But I also tried it with a static array. The code is still there, commented out, if you want to inspect it or try it. The same problem was present then.


Solution

  • There is a race condition in your code: you cannot safely do counter++ and arr[counter] = i in this parallel loop because multiple thread can find primes at the same time.

    A simple way to fix this is to use a critical section (with #pragma omp critical so the content of the condition is executed by one thread at a time and counter is protected. This is not so bad when n is large since there are not a lot of prime number compared to classical numbers (AFAIK O(n/log(n)) prime numbers). This solution is certainly the best on platforms with few cores.

    An alternative solution is to use atomics (wit #pragma omp atomic), but they are not far better than a lock here on most mainstream architecture.

    Another solution is to write the prime number found in a thread-local buffer and then copy it in a shared buffer. That being said, the order is not preserved. You can restore the order using a sort, but sorting algorithms tends to be a bit slow. A parallel sort can be relatively fast, but hard to implement efficiently.

    A last solution is to allocate a 8-bit mask array of size n initialized to 0 and flag prime numbers by setting the value to 1. This solution is certainly the best on many-core platforms. You can then travel the mask array to find the prime numbers. This can even be done in parallel in two steps: 1 parallel reduction to count the number of values and one to fill the final array containing prime numbers.

    Also please note that the order of printf calls is not deterministic in a parallel loop.