OpenMP: Having threads execute a for loop in order


I'd like to run something like the following:

    for (int index = 0; index < num; index++) {
        /* work on index */
    }

I want to run this loop with four threads, with iterations handed out in order: 0, 1, 2, 3, 4, 5, 6, 7, 8, and so on. That is, before the threads start working on index = n, (n+1), (n+2), (n+3) (in any order within that group, but always in this pattern), I want iterations index = 0, 1, 2, ..., (n-1) to already be finished. Is there a way to do this? The ordered clause doesn't really work here, because making the whole body an ordered section would remove essentially all parallelism, and a schedule clause doesn't seem to help either, because I don't want each thread to work through a contiguous block of iterations such as k through k + num/4. Thanks for any help!


Solution

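  • You can do this not with a parallel for loop, but with a parallel region that manages its own loop, plus a barrier so that every thread finishes its current iteration before any thread starts the next one. Each thread claims the next index with an atomic fetch-and-add. OpenMP requires every thread in the team to encounter each barrier, so this version assumes that num is a multiple of the number of threads (12 and 4 here) and that the runtime actually provides the requested threads. Example: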

    #include <stdatomic.h>
    #include <stdio.h>
    #include <omp.h>
    
    int main()
    {
      atomic_int chunk = 0;
      int num = 12;
      int nthreads = 4;
      
      omp_set_num_threads(nthreads);
      
    #pragma omp parallel shared(chunk, num, nthreads)
      {
        // Each thread atomically claims the next unprocessed index
        for (int index; (index = atomic_fetch_add(&chunk, 1)) < num; ) {
          printf("In index %d\n", index);
          fflush(stdout);
          // Wait here until every thread has finished its index for this round
    #pragma omp barrier
    
          // For illustrative purposes only; not needed in real code
    #pragma omp single
          {
            puts("After barrier");
            fflush(stdout);
          }
        }
      }
    
      puts("Done");
      return 0;
    }
    

    One possible output:

    $ gcc -std=c11 -O -fopenmp -Wall -Wextra demo.c
    $ ./a.out
    In index 2
    In index 3
    In index 1
    In index 0
    After barrier
    In index 4
    In index 6
    In index 5
    In index 7
    After barrier
    In index 10
    In index 9
    In index 8
    In index 11
    After barrier
    Done
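
If num is not guaranteed to be a multiple of the thread count, one way around the barrier requirement is to have every thread run the same number of rounds and skip the work when its index falls past the end. The following is only a sketch under stated assumptions, not a drop-in replacement: run_in_rounds, finished, and MAX_ITEMS are hypothetical names introduced here, the "work" is just setting a flag, and the index comes from the thread number rather than from an atomic counter. The serial fallbacks are there only so the file still builds without -fopenmp.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial fallbacks (assumption: used only when built without -fopenmp). */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

enum { MAX_ITEMS = 64 };          /* hypothetical upper bound for this demo */
static int finished[MAX_ITEMS];   /* finished[i] == 1 once index i is done  */

/* Processes indices 0..num-1 in rounds of consecutive indices, one index per
   thread per round. Returns how many times an iteration saw an index from an
   earlier round that was not yet finished (0 if the ordering holds). */
static int run_in_rounds(int num)
{
  int violations = 0;

  assert(num >= 0 && num <= MAX_ITEMS);
  for (int i = 0; i < num; i++)
    finished[i] = 0;

#pragma omp parallel num_threads(4) reduction(+:violations)
  {
    int tid = omp_get_thread_num();
    int team = omp_get_num_threads();   /* may be fewer than requested */

    /* All threads iterate over the same sequence of round bases, so every
       thread reaches every barrier, even in a partially filled last round. */
    for (int base = 0; base < num; base += team) {
      int index = base + tid;
      if (index < num) {
        /* Check the property from the question: everything below this
           round's first index must already be done. */
        for (int prev = 0; prev < base; prev++)
          if (!finished[prev])
            violations++;
        finished[index] = 1;            /* stand-in for the real work */
      }
#pragma omp barrier
    }
  }
  return violations;
}
```

Calling run_in_rounds(13) from main should return 0 whatever team size the runtime provides, since the barrier separates each round from the next. The trade-off against the atomic counter version is that indices are tied to thread numbers, so a slow thread holds up its whole round in the same way, but no thread can leave the loop while others are still waiting at the barrier.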