Search code examples
coptimizationopenmpconways-game-of-life

Game of Life with OpenMP


I did a sequential version of game of life, but now I need to make a parallel version of my code using OpenMP, but I am having some issues with it. If anyone could help me, it would be very nice. Thks. Here's my sequential code:

// Swapping the two grids   
 #define SWAP_BOARDS( b1, b2 )  do { \
 char* temp = b1; \
 b1 = b2; \
 b2 = temp; \
 } while(0)

// Simplifying access to grid elements
   #define BOARD( G, X, Y )  ((G)[NC*(X)+(Y)])

 char* sequential_game_of_life (char* outgrid, char* ingrid, 
       const int nrows, const int ncols, const int gens_max) {

  const int NC = ncols;
  int curgen, i, j;

 for (curgen = 0; curgen < gens_max; curgen++)
   {

  for (i = 0; i < nrows; i++)
{
  for (j = 0; j < ncols; j++)
    {
      const int inorth = mod (i-1, nrows);
      const int isouth = mod (i+1, nrows);
      const int jwest = mod (j-1, ncols);
      const int jeast = mod (j+1, ncols);

      const char neighbor_count = 
    BOARD (ingrid, inorth, jwest) + 
    BOARD (ingrid, inorth, j) + 
    BOARD (ingrid, inorth, jeast) + 
    BOARD (ingrid, i, jwest) +
    BOARD (ingrid, i, jeast) + 
    BOARD (ingrid, isouth, jwest) +
    BOARD (ingrid, isouth, j) + 
    BOARD (ingrid, isouth, jeast);

      BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j));
    }
}
  SWAP_BOARDS( outgrid, ingrid );
}
  return outgrid;
 }

I know that I have to parallel those 3 for's, but I cant see how to do that.


Solution

  • I think the outer loop can not be parallelized, because input to each generation is the previous generation, so it has a sequential formula (at least you can not do it with minor changes!)

    In case of nested loops which traverse a matrix or something like that, I prefer to run a single loop from 0 to ncol*nrow (in your case) and find i and j from loop index.

    like this:

    // because you are running a parallel codes multiple times in a loop,
    // it would be better to make the thread swarm first and schedule the
    // tasks in each loop iteration, to avoid multiple creation and destruction
    // of working threads
    #pragma omp parallel
    for (curgen = 0; curgen < gens_max; curgen++)
    {
        #pragma omp for
        for (t = 0; t < nrows*ncols; t++)
        {
            int i = t / ncols;
            int j = t % ncols;
            const int inorth = mod (i-1, nrows);
            const int isouth = mod (i+1, nrows);
            const int jwest = mod (j-1, ncols);
            const int jeast = mod (j+1, ncols);
    
            const char neighbor_count = 
                BOARD (ingrid, inorth, jwest) + 
                BOARD (ingrid, inorth, j) + 
                BOARD (ingrid, inorth, jeast) + 
                BOARD (ingrid, i, jwest) +
                BOARD (ingrid, i, jeast) + 
                BOARD (ingrid, isouth, jwest) +
                BOARD (ingrid, isouth, j) + 
                BOARD (ingrid, isouth, jeast);
    
            BOARD(outgrid, i, j) = alivep (neighbor_count, BOARD (ingrid, i, j));
        }
        SWAP_BOARDS( outgrid, ingrid );
    }
    

    I ran this code on my laptop with Dual Core 2.53 GHz CPU on a 1000x1000 matrix over 1000 generations, and got 69% speed up.