c++ loops parallel-processing openmp

How to run only the inner task of a nested loop on each thread in OpenMP


I'm trying to write a program that multiplies two matrices in parallel, so that each thread multiplies one row by one column. The problem is that if I put the 'omp for' on the outer loop, each thread executes the entire inner loop instead of just a single task. If I put the 'omp for' on the inner loop instead, the outer loop is executed by every thread, because it is inside the scope of 'omp parallel'. I want only the task itself to run on a thread, and I do not want the outer loop to be executed multiple times.

for (int line = 0; line < n; ++line) {
    for (int column = 0; column < n; ++column) {
        // only this call needs to run in its own thread
        multiply_line_per_column(line, column);
    }
}

Solution

  • One option is to use the collapse clause, which merges the nested loops into a single iteration space that is divided among the threads: https://stackoverflow.com/a/13357158/2485717

    You can also flatten the nested loops into a single loop yourself:

    for (int i = 0; i < n * n; ++i) {
        // recover the 2D indices in the same order as the nested loops
        int line = i / n;
        int column = i % n;
        multiply_line_per_column(line, column);
    }
    

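    As pointed out by @Hristo Iliev in the comments, the integer division and modulo operations add considerable cost to every iteration.

    The drawback is more noticeable when n is not a power of 2, since division and modulo by a power of 2 can be reduced to a shift and a bit mask when the compiler knows the value of n.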
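
For reference, the collapse option could look roughly like the sketch below. It is only an illustration under some assumptions: the matrices are stored as row-major std::vector<double>, and multiply_line_per_column is given a hypothetical signature and body, since the question does not show the real one. The wrapper function multiply is also made up for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the question's multiply_line_per_column:
// computes one cell of c = a * b for n x n row-major matrices.
void multiply_line_per_column(const std::vector<double>& a,
                              const std::vector<double>& b,
                              std::vector<double>& c,
                              int n, int line, int column) {
    double sum = 0.0;
    for (int k = 0; k < n; ++k) {
        sum += a[line * n + k] * b[k * n + column];
    }
    c[line * n + column] = sum;  // each (line, column) pair writes its own cell
}

// Hypothetical wrapper: multiplies two n x n matrices in parallel.
std::vector<double> multiply(const std::vector<double>& a,
                             const std::vector<double>& b, int n) {
    const std::size_t cells = static_cast<std::size_t>(n) * n;
    assert(a.size() == cells && b.size() == cells);
    std::vector<double> c(cells, 0.0);

    // collapse(2) fuses the two perfectly nested loops into one iteration
    // space of n * n (line, column) pairs, which is divided among the threads.
    // The outer loop is therefore not repeated by every thread.
    #pragma omp parallel for collapse(2)
    for (int line = 0; line < n; ++line) {
        for (int column = 0; column < n; ++column) {
            multiply_line_per_column(a, b, c, n, line, column);
        }
    }
    return c;
}
```

With GCC or Clang this needs -fopenmp to actually run in parallel; without the flag the pragma is ignored and the loops run serially with the same result. No synchronization is needed here because every iteration writes to a different element of c.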