I am studying OpenMP and have written an implementation of shaker sort. There are two consecutive loops here, and to make them run one after the other I added locks via omp_init_lock and omp_destroy_lock, but the result is still incorrect. Please tell me how to parallelize two consecutive loops. My code is below:
int Left, Right;
Left = 1;
Right = ARR_SIZE;
while (Left <= Right)
{
    omp_init_lock(&lock);
    #pragma omp parallel reduction(+:Left) num_threads(4)
    {
        #pragma omp for
        for (int i = Right; i >= Left; i--) {
            if (Arr[i - 1] > Arr[i]) {
                int temp;
                temp = Arr[i];
                Arr[i] = Arr[i - 1];
                Arr[i - 1] = temp;
            }
        }
        Left++;
    }
    omp_destroy_lock(&lock);

    omp_init_lock(&lock);
    #pragma omp parallel reduction(+:Right) num_threads(4)
    {
        #pragma omp for
        for (int i = Left; i <= Right; i++) {
            if (Arr[i - 1] > Arr[i]) {
                int temp;
                temp = Arr[i];
                Arr[i] = Arr[i - 1];
                Arr[i - 1] = temp;
            }
        }
        Right--;
    }
    omp_destroy_lock(&lock);
}
You seem to have several misconceptions about how OpenMP works.
Your code looks like you expected the two parallel regions to behave like #pragma omp sections. Side note: unless you have absolutely no other choice and/or you know exactly what you are doing, don't use sections. They don't scale well.
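For reference only (not a recommendation), here is a minimal sketch of what sections actually does: each block is executed exactly once, possibly by different threads running at the same time. The function name and the printf output are just placeholders for this illustration.

#include <cstdio>

void sections_demo()
{
    #pragma omp parallel sections
    {
        #pragma omp section
        { std::printf("block A\n"); }   // executed once, by some thread

        #pragma omp section
        { std::printf("block B\n"); }   // may run concurrently with block A
    }
}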
omp_init_lock initializes a lock object; it doesn't acquire it. Likewise, omp_destroy_lock deallocates the lock object; it doesn't release the lock. If you ever want to acquire a lock, use omp_set_lock and omp_unset_lock on a lock that you initialize once before you enter the parallel section. Generally speaking, if you need a lock around an extended section of your code, that code will not parallelize. Read up on Amdahl's law. Locks are only useful if they are taken rarely or if the chance of two threads competing for the same lock at the same time is low.
Your code contains race conditions. Since you used #pragma omp for, two different threads may execute the i-th and (i-1)-th iterations at the same time. That means they will touch the same array elements. That's undefined behavior and will lead to them stepping on each other's toes, so to speak.
I have no idea what you wanted to do with those reductions. As written, each thread gets a private copy of Left (or Right) initialized to 0, the identity element for +, so the loop bounds inside the region don't even see the value you set outside, and when the region ends the shared variable advances by one per thread rather than by one.
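For contrast, this is the kind of thing reduction is meant for: combining per-thread partial results, for example a parallel sum. This is just an illustrative sketch, not part of your sort.

long parallel_sum(const int* arr, int n)
{
    long sum = 0;                    // each thread gets a private copy, initialized to 0
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < n; ++i)
        sum += arr[i];               // the private copies are added together at the end
    return sum;
}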
Well, traditional shaker sort cannot work in parallel because within one iteration of the outer loop, an element may travel the whole distance to the end of the range. That requires an amount of inter-thread coordination that is infeasible.
What you can do is a variation of bubble sort, known as odd-even transposition sort, where each thread looks at one pair of adjacent values and swaps them if they are out of order. Shift this pairing back and forth by one position and values will slowly travel towards their correct positions.
This should work:
#include <utility>
// using std::swap
void shake_sort(int* arr, int n) noexcept
{
    using std::swap;
    const int even_to_odd = n / 2;        // number of (even, odd) index pairs
    const int odd_to_even = (n - 1) / 2;  // number of (odd, even) index pairs
    bool any_swap;
    do {
        any_swap = false;
        // compare-and-swap the pairs (0,1), (2,3), (4,5), ...
#       pragma omp parallel for reduction(|:any_swap)
        for(int i = 0; i < even_to_odd; ++i) {
            int left = i * 2;
            int right = left + 1;
            if(arr[left] > arr[right]) {
                swap(arr[left], arr[right]);
                any_swap = true;
            }
        }
        // compare-and-swap the pairs (1,2), (3,4), (5,6), ...
#       pragma omp parallel for reduction(|:any_swap)
        for(int i = 0; i < odd_to_even; ++i) {
            int left = i * 2 + 1;
            int right = left + 1;
            if(arr[left] > arr[right]) {
                swap(arr[left], arr[right]);
                any_swap = true;
            }
        }
    } while(any_swap);
}
Note that you cannot shrink the range at the left and right borders the way classic shaker sort does, because one outer iteration cannot guarantee that the values at the borders are already in their final positions.
Two more notes: std::swap makes the code more readable, and I dropped the num_threads clause; OpenMP can figure that out itself.
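In case it helps, a small usage sketch; the values are arbitrary, and you need to build with OpenMP enabled, e.g. -fopenmp on GCC/Clang.

#include <cstdio>

int main()
{
    int data[] = { 5, 1, 4, 2, 8, 7, 3, 6, 0, 9 };
    shake_sort(data, 10);
    for (int v : data)
        std::printf("%d ", v);       // prints 0 1 2 3 4 5 6 7 8 9
    std::printf("\n");
}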