How can I parallelize this with OpenMP 3.1? I tried a collapse clause, but the compiler reports:
error: initializer expression refers to iteration variable ‘k’
for (j = k+1; j < N; ++j){
When I use a plain parallel for instead, the threads seem to repeat some iterations and skip others, so the result is sometimes greater and sometimes less than expected:
int N = 100;
int *x;
x = (int*) malloc ((N+1)*sizeof(int));
//... initialization of the array x ...
// ...
for (k = 1; k < N-1; ++k)
{
    for (j = k+1; j < N; ++j)
    {
        s = x[k] + x[j];
        if (fn(s) == 1){
            count++;
        }
    }
}
Count must be 62 but is random
Based on the code snippet you provided, the relevant restriction on nested parallel loops in the OpenMP 3.1 standard is:
The iteration count for each associated loop is computed before entry to the outermost loop. If execution of any associated loop changes any of the values used to compute any of the iteration counts, then the behavior is unspecified.
Since the iteration count of your inner loop depends on the iteration variable of your outer loop (i.e., j = k+1), you cannot do the following:
#pragma omp parallel for collapse(2) schedule(static, 1) private(j) reduction(+:count)
for (k = 1; k < N-1; ++k)
for (j = k+1; j < N; ++j)
...
Moreover, from the OpenMP 3.1 "Loop Construct" section (relevant to this question) one can read:
for (init-expr; test-expr; incr-expr) structured-block
where init-expr is one of the following:
...
integer-type var = lb
...
and test-expr is one of the following:
...
var relational-op b
with the following restriction on lb and b:
Loop invariant expressions of a type compatible with the type of var.
However, as kindly pointed out by @Hristo Iliev, "that changed in 5.0 where support for non-rectangular loops was added." In the OpenMP 5.0 "Loop Construct" section, the restrictions on lb and b are now:
Expressions of a type compatible with the type of var that are loop invariant with respect to the outermost associated loop or are one of the following (where var-outer, a1, and a2 have a type compatible with the type of var, var-outer is var from an outer associated loop, and a1 and a2 are loop invariant integer expressions with respect to the outermost loop):
...
var-outer + a2
...
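To illustrate the relaxed rule, here is a minimal sketch of the same triangular loop nest with collapse(2), which is only valid from OpenMP 5.0 onwards because the inner lower bound k + 1 has the form var-outer + a2. The function name count_pairs_collapsed and the body of fn are assumptions made up for this example (the question does not show fn). The _OPENMP version check is a heuristic: a compiler may implement non-rectangular loops without reporting 5.0, or the other way round, so check your compiler's documentation.

```c
/* Hypothetical stand-in for the question's fn(): 1 when s is divisible by 7. */
static int fn(int s) { return s % 7 == 0; }

/* Counts pairs (k, j) with 1 <= k < j < n for which fn(x[k] + x[j]) == 1. */
static int count_pairs_collapsed(const int *x, int n)
{
    int count = 0;
    int k, j;
#if defined(_OPENMP) && _OPENMP >= 201811
    /* OpenMP 5.0 or later: both loops are associated with the construct,
       so k and j are privatized automatically. */
    #pragma omp parallel for collapse(2) reduction(+:count)
#else
    /* Older or no OpenMP: parallelize the outer loop only; j must be private. */
    #pragma omp parallel for private(j) reduction(+:count)
#endif
    for (k = 1; k < n - 1; ++k) {
        for (j = k + 1; j < n; ++j) {
            if (fn(x[k] + x[j]) == 1) {
                count++;
            }
        }
    }
    return count;
}
```

Compiled without OpenMP support, the pragmas are ignored and the function simply runs sequentially, which is a convenient way to check the expected result.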
As an alternative to the collapse clause, you can use a plain parallel for on the outer loop. Bear in mind that the update of the variable count is a race condition unless it is handled with reduction(+:count):
#pragma omp parallel for schedule(static, 1) private(j, s) reduction(+:count)
for (k = 1; k < N-1; ++k){
    for (j = k+1; j < N; ++j)
    {
        s = x[k] + x[j];
        if (fn(s) == 1){
            count++;
        }
    }
}
Important note: k does not have to be listed as private, since it is the iteration variable of the loop being parallelized and OpenMP implicitly makes it private. The same does not apply to the variable j (or to s, if it is declared outside the loop): in C these are shared by default and must be made private explicitly. That is one of the reasons why:
Count must be 62 but is random
The other was the missing reduction(+:count) clause.
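Putting both fixes together, the sketch below compares a parallel version against a sequential reference. The names count_pairs and count_pairs_serial and the body of fn are assumptions for illustration only, since the question does not show fn or how x is initialized. Here j and s are declared inside the loop body, which makes them private to each thread without needing a private clause.

```c
/* Hypothetical stand-in for the question's fn(): 1 when s is divisible by 7. */
static int fn(int s) { return s % 7 == 0; }

/* Parallel version: counts pairs (k, j), 1 <= k < j < n, with fn(x[k] + x[j]) == 1. */
static int count_pairs(const int *x, int n)
{
    int count = 0;
    int k;
    /* k is private automatically; count is combined across threads by the reduction.
       schedule(static, 1) hands out the outer iterations round-robin, since the
       inner loop gets shorter as k grows. */
    #pragma omp parallel for schedule(static, 1) reduction(+:count)
    for (k = 1; k < n - 1; ++k) {
        /* j and s live inside the loop body, so each thread has its own copy. */
        for (int j = k + 1; j < n; ++j) {
            int s = x[k] + x[j];
            if (fn(s) == 1) {
                count++;
            }
        }
    }
    return count;
}

/* Sequential reference used to check the parallel result. */
static int count_pairs_serial(const int *x, int n)
{
    int count = 0;
    for (int k = 1; k < n - 1; ++k) {
        for (int j = k + 1; j < n; ++j) {
            if (fn(x[k] + x[j]) == 1) {
                count++;
            }
        }
    }
    return count;
}
```

With the reduction and the private inner variables in place, count_pairs(x, N) should return the same value as count_pairs_serial(x, N) on every run, regardless of the number of threads.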