I am asked to write an OpenMP program in C in which the main thread distributes the work to other threads and, while they are working on their tasks, periodically checks whether they are done; if they are not, it should increment a shared variable.
This is the function for the threads' task:
void work_together(int *a, int n, int number, int thread_count) {
# pragma omp parallel for num_threads(thread_count) \
shared(a, n, number) schedule(static, n/thread_count)
for (long i=0; i<n; i++) {
// do a task, such as:
a[i] = a[i] * number;
}
}
And it gets called from main:
int main(int argc, char *argv[]) {
int n = atoi(argv[1]);
int arr[n];
initialize(arr, n);
// this will be the shared variable
int number = 2;
work_together(arr, n, number, thread_count);
//I want to write a function or an if to check whether threads are still working
/* if (threads_still_working()) {
number++;
sleep(100);
}
*/
printf("There are %d threads\n", omp_get_num_threads());
}
thread_count is initialized as 4, and I tried executing it for large n's (>10000), but the master thread will always wait for the other threads to finish executing the for loop, and will only continue the main when work_together() returns: the printf() will always print that there's only one thread running.
Now, what would be a way to check from the master thread whether the other threads are still running, and do some incrementing if they are?
From the OpenMP standard one can read:
When a thread encounters a parallel construct, a team of threads is created to execute the parallel region. The thread that encountered the parallel construct becomes the master thread of the new team, with a thread number of zero for the duration of the new parallel region. All threads in the new team, including the master thread, execute the region. Once the team is created, the number of threads in the team remains constant for the duration of that parallel region.
Consequently, with the directive `#pragma omp parallel for num_threads` all threads in the team, including the master thread, will be performing the parallel work (i.e., computing the iterations of the loop), which is not what you want. The combined directive makes the compiler automatically divide the iterations of the loop among every thread in the team, so to get around this you can use a plain parallel region and implement the loop distribution yourself. The code would look like the following:
# pragma omp parallel num_threads(thread_count) shared(a, n, number)
{
int thread_id = omp_get_thread_num();
int total_threads = omp_get_num_threads();
if(thread_id != 0) // all threads but the master thread
{
thread_id--; // shift all the ids
total_threads = total_threads - 1;
for(long i = thread_id ; i < n; i += total_threads) {
// do a task, such as:
a[i] = a[i] * number;
}
}
}
First, we ensure that all threads except the master (i.e., if(thread_id != 0)) execute the loop to be parallelized; then we divide the iterations of the loop among the remaining threads (i.e., for(long i = thread_id ; i < n; i += total_threads)). I have chosen a static cyclic distribution with a chunk size of 1; you can choose a different one, but you will have to adapt the loop accordingly.
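As an illustration of adapting the loop to a different distribution, here is a minimal sketch of a contiguous (block) split. The helper name block_range and its parameters are my own and not part of OpenMP; it only computes which indices a given worker would own.

```c
/* Hypothetical helper (not an OpenMP API): computes the half-open range
   [*begin, *end) owned by worker `w` (0-based, after shifting the ids)
   when n iterations are split into contiguous blocks among `workers`
   threads. Assumes workers > 0. */
void block_range(long n, int w, int workers, long *begin, long *end) {
    long base = n / workers;    /* minimum number of iterations per worker */
    long extra = n % workers;   /* the first `extra` workers get one more */
    *begin = w * base + (w < extra ? w : extra);
    *end = *begin + base + (w < extra ? 1 : 0);
}
```

Inside the parallel region, each worker would call block_range(n, thread_id, total_threads, &begin, &end) after the id shift and then run for(long i = begin; i < end; i++) instead of the strided loop.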
Now you just need to add the logic to:
Now, what would be a way to check from the master thread whether the other threads are still running, and do some incrementing if they are?
So that I do not give away too much, I will add pseudocode that you will have to convert to real code to make it work:
// declare a shared variable to count the number of worker threads
// that have finished working: count_thread_finished
# pragma omp parallel num_threads(thread_count) shared(a, n, number)
{
int thread_id = omp_get_thread_num();
int total_threads = omp_get_num_threads();
if(thread_id != 0) // all threads but the master thread
{
thread_id--; // shift all the ids
total_threads = total_threads - 1;
for(long i = thread_id ; i < n; i += total_threads) {
// do a task, such as:
a[i] = a[i] * number;
}
// count_thread_finished++
}
else{ // the master thread
while(count_thread_finished != total_threads -1){
// wait for a while....
}
}
}
Bear in mind, however, that since the variable count_thread_finished is shared among threads, you will need to ensure mutual exclusion (e.g., using omp atomic) on its updates; otherwise you will have a race condition. The read in the master's while loop should be protected the same way (e.g., with omp atomic read). This should give you enough to keep going.
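For reference, one way the pseudocode could be turned into real code is sketched below. Several things here are my assumptions rather than requirements: the function name work_with_polling, the counter polls that the master increments (kept separate from the multiplier so that the workers never read a value that is being modified), the fallback for a team of a single thread, and the serial stubs used when the file is compiled without OpenMP.

```c
#ifdef _OPENMP
#include <omp.h>
#else
/* Serial stand-ins so the sketch still builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
static int omp_get_num_threads(void) { return 1; }
#endif

/* Hypothetical helper: the worker threads multiply a[0..n-1] by factor
   while thread 0 polls a shared counter. Returns the number of polls. */
long work_with_polling(int *a, long n, int factor, int thread_count) {
    int finished = 0;   /* shared: how many workers are done */
    long polls = 0;     /* shared, but only written by thread 0 */

    #pragma omp parallel num_threads(thread_count)
    {
        int thread_id = omp_get_thread_num();
        int workers = omp_get_num_threads() - 1;

        if (workers == 0) {
            /* Team of one thread: nobody to wait for, do the work here. */
            for (long i = 0; i < n; i++)
                a[i] *= factor;
        } else if (thread_id != 0) {
            /* Worker: cyclic distribution over the shifted ids. */
            for (long i = thread_id - 1; i < n; i += workers)
                a[i] *= factor;
            #pragma omp atomic update
            finished++;
        } else {
            /* Master: busy-wait; a short sleep could be added here. */
            int done = 0;
            while (done != workers) {
                polls++;
                #pragma omp atomic read
                done = finished;
            }
        }
    }   /* implicit barrier: all writes to a[] are complete here */
    return polls;
}
```

Calling work_with_polling(arr, n, 2, 4) from main would double every element; the returned count depends on timing, so it will differ from run to run.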
Btw: schedule(static, n/thread_count) is mostly not needed, since by default most OpenMP implementations already divide the iterations of the loop among the threads as contiguous chunks.