So I started using OpenMP (multithreading) to increase the speed of my matrix multiplication and I witnessed weird things: when I turn off OpenMP Support (in Visual Studio 2019) my nested for-loop completes 2x faster. So I removed "#pragma omp critical" to test if it slows down the proccess significantly and the proccess went 4x faster than before (with OpenMP Support On).
Here's my question: is "#pragma omp critical" important in nested loop? Can't I just skip it?
#pragma omp parallel for collapse(3)
for (int i = 0; i < this->I; i++)
{
for (int j = 0; j < A.J; j++)
{
m.matrix[i][j] = 0;
for (int k = 0; k < A.I; k++)
{
#pragma omp critical
m.matrix[i][j] += this->matrix[i][k] * A.matrix[k][j];
}
}
}
Here's my question: is "#pragma omp critical" important in nested loop? Can't I just skip it?
If the matrices m
, this
and A
are different you do not need any critical region. Instead, you need to ensure that each thread will write to a different position of the matrix m
as follows:
#pragma omp parallel for collapse(2)
for (int i = 0; i < this->I; i++)
{
for (int j = 0; j < A.J; j++)
{
m.matrix[i][j] = 0;
for (int k = 0; k < A.I; k++)
{
m.matrix[i][j] += this->matrix[i][k] * A.matrix[k][j];
}
}
}
The collapse
clause will assign to each thread a different pair (i, j)
therefore there will not be multiple threads writing to the same position of the matrix m
(i.e., race-condition).