I'm starting with OpenMP now, and need to achieve parallelism in this task with OpenMP:
for (i = 0; i < N; i++){
index += i*alpha + 4;
sum += array[index];
}
I intend to use reduction operator to get the correct value of sum variable , but in order to do that I need to solve the dependency of the index variable first. How can I do that?
Thanks for all the help!
The loop cannot be straightforwardly parallelized because of the sequential dependency. However, it can be first split in two parts:
for (i = 0; i < N; i++) {
index += i*alpha + 4;
tmp[i] = index;
}
for (i = 0; i < N; i++) {
sum += array[tmp[i]];
}
Once split, the first loop exhibits a scan pattern and can be parallelized although the parallel version may not be faster than the sequential one. The second loop is a simple reduction that can be quite easily parallelized.
Performing a scan patterns with OpenMP is usually a bit tricky. Hopefully, because alpha
seems to be a loop constant, all the value of index
can easily predicted thanks to basic math properties:
index_i = index_init + 0*alpha+4 + 1*alpha+4 + 2*alpha+4 + ... + i*alpha+4
= index_init + (0+1+2+...+i)*alpha + 4*(i+1)
= index_init + (i*(i+1)/2)*alpha + 4*(i+1)
= index_init + (i+1)*(i*alpha + 8)/2
Thus we can write the final resulting code:
#pragma omp parallel for reduction(+:sum)
for (i = 0; i < N; i++) {
sum += array[index + (i*(i+1)/2)*alpha + 4*(i+1)];
}
Moreover, if alpha
is an integer and index
starts from zero, the array indexing can be made slightly faster:
#pragma omp parallel for reduction(+:sum)
for (i = 0; i < N; i++) {
sum += array[(i+1)*(i*alpha+8)/2];
}