
OpenMP parallel calculating for loop indices


My parallel programming class has the program below, which demonstrates how to use the parallel construct in OpenMP to calculate the array bounds each thread uses in a for loop.

#pragma omp parallel
{
  int id = omp_get_thread_num();
  int p = omp_get_num_threads();
  int start = (N * id) / p;
  int end = (N * (id + 1)) / p;
  if (id == p - 1) end = N;
  for (int i = start; i < end; i++)  // declare i here so each thread has its own loop counter
  {
    A[i] = x * B[i];
  }
}

My question is: is the if statement (id == p - 1) necessary? From my understanding, if id == p - 1, then end will already be N, so the if statement is redundant. I asked on my class's Q&A board, but wasn't able to get an answer that I understood. Assume N is the size of the array, x is just an int, and id is between 0 and p - 1.


Solution

  • You are right. For id == p - 1, (N * ((p - 1) + 1)) / p simplifies to (N * p) / p. Since p is strictly positive (the number of OpenMP threads in a parallel region is guaranteed to be at least 1), this equals N as long as N * p does not overflow. Such a condition is useful when the integer division truncates before the multiplication, but that is not the case here (it would be the case with something like end = (N / p) * (id + 1)).

    Note that this code is not safe for large N: sizeof(int) is often 4, so the product N * (id + 1) can overflow (which is undefined behaviour for signed integers) well before N itself reaches the limit of int. The risk grows with the number of threads, since the product can be as large as N * p, so it matters most on machines with many cores, such as supercomputer nodes. It is better to use the size_t type, an unsigned type (64 bits wide on typical 64-bit platforms) meant to represent the size of any object (for example the size of an array).