I'm a beginner in OpenMP learning about tasks and would like to apply them in the scenario below. The example per se is completely useless; the point is to parallelize a loop that contains a break condition and performs an I/O operation. I left the break condition generic for the purpose of this question.
I attempted the following approach, but the program becomes insanely slow (probably because I'm creating too many tasks). I also tried the taskloop construct, but the results were almost the same. For now, I'm trying to figure out how to make the damn thing run reasonably fast before thinking about the break condition. Any advice on how I should proceed?
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

const int max_iters = 500000;
const char cmd_format[300] = "echo Iteration %d\n";

int main(int argc, char **argv) {
    FILE *fp;
    char ret[200];
    char cmd[400];

    #pragma omp parallel
    #pragma omp single
    for (int i = 0; i < max_iters; i++) {
        #pragma omp task default(none) \
            firstprivate(i,cmd,fp,ret) \
            shared(cmd_format,max_iters)
        {
            sprintf((char *)&cmd, cmd_format, i);
            fp = popen(cmd, "r");
            while (!feof(fp)) {
                fgets((char *)&ret, 200, fp);
                if (/* some condition */) {
                    printf("Done\n");
                    i = max_iters;
                }
            }
            pclose(fp);
        }
    }
    return EXIT_SUCCESS;
}
I haven't tested the code below, but it should still show the idea. The problem is that you cannot easily abort a parallel computation. The way to do this in OpenMP is cancellation; see https://www.openmp.org/spec-html/5.1/openmpse28.html#x144-1560002.20.
Here's the code that uses the cancel construct:
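#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

const int max_iters = 500000;
const char cmd_format[300] = "echo Iteration %d\n";

int main(int argc, char **argv) {
    #pragma omp parallel
    #pragma omp single
    #pragma omp taskgroup
    for (int i = 0; i < max_iters; i++) {
        #pragma omp task default(none) firstprivate(i) \
            shared(cmd_format,max_iters)
        {
            /* Per-task buffers, so nothing uninitialized is copied in. */
            char cmd[400];
            char ret[200];
            int done = 0;

            snprintf(cmd, sizeof cmd, cmd_format, i);
            FILE *fp = popen(cmd, "r");
            if (fp != NULL) {
                /* Loop on fgets itself; testing feof first reuses a stale buffer at EOF. */
                while (!done && fgets(ret, sizeof ret, fp) != NULL) {
                    if (/* some condition */) {
                        /* A plain store is an atomic write, not an update.
                           i is firstprivate, so this changes only this task's copy. */
                        #pragma omp atomic write
                        i = max_iters;
                        printf("Done\n");
                        done = 1;
                    }
                }
                /* Close before cancelling: cancel jumps to the end of the task. */
                pclose(fp);
            }
            if (done) {
                #pragma omp cancel taskgroup
            }
        }
    }
    return EXIT_SUCCESS;
}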
After the single construct I have introduced a taskgroup construct, so that there is now a logical grouping of all the tasks that are created. Once your condition is satisfied, the cancel construct triggers cancellation of the tasks of the taskgroup.
One thing to note is: this is not an immediate action. If a task has already started to execute, it will not be terminated unless it reaches a cancellation point or it reaches the end of its code region. Tasks that have been generated, but are sitting in the task pool and haven't started execution yet, are discarded.
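To illustrate the point about running tasks, here is a minimal sketch of an explicit cancellation point. The workload and the names cancellation_enabled and run_steps are hypothetical and only serve the illustration. It also relies on the fact that cancellation is switched off by default and has to be enabled through the OMP_CANCELLATION environment variable; without that, the cancel construct and the cancellation point do nothing.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Returns 1 if the runtime will honour cancel constructs, 0 otherwise. */
int cancellation_enabled(void) {
#ifdef _OPENMP
    return omp_get_cancellation() != 0;
#else
    return 0; /* compiled without OpenMP: the pragmas are ignored */
#endif
}

/*
 * Hypothetical workload: n_tasks tasks, each doing steps_per_task units of
 * work. Task number cancel_task (use -1 for "none") requests cancellation.
 * Returns how many units of work were actually carried out.
 */
int run_steps(int n_tasks, int steps_per_task, int cancel_task) {
    int steps_done = 0; /* shared counter, updated atomically */

    #pragma omp parallel
    #pragma omp single
    #pragma omp taskgroup
    for (int t = 0; t < n_tasks; t++) {
        #pragma omp task default(none) \
            firstprivate(t, steps_per_task, cancel_task) shared(steps_done)
        {
            if (t == cancel_task) {
                /* With cancellation enabled, this task ends right here. */
                #pragma omp cancel taskgroup
            }
            for (int s = 0; s < steps_per_task; s++) {
                /* A task that is already running only stops at a point like this. */
                #pragma omp cancellation point taskgroup
                #pragma omp atomic update
                steps_done++;
            }
        }
    }
    return steps_done;
}
```

If the program is started with OMP_CANCELLATION=true, run_steps may return fewer units than n_tasks * steps_per_task, and the exact number depends on scheduling. Without it, every task runs to completion.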
So, when using this feature, you need to make your code aware of this and handle these situations. For instance, if the termination condition can be true for multiple tasks, your code must cope with two or more tasks entering the if statement and trying to record the result (in your case, setting i to max_iters). That's why I have added the atomic construct around the update. Note, though, that i is firstprivate in the task, so as written the assignment only changes that task's own copy; the atomic only matters once the result is recorded in a shared variable. Even then, a late task might still overwrite the result of an earlier task. Since what has to happen in your real code depends on what it actually does, I cannot give better advice.
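One possible way to handle several tasks hitting the condition is sketched below, under some assumptions: the result is an iteration index, it is kept in a shared variable (found), and the smallest index should win. The helper condition_holds and the function find_match are hypothetical stand-ins for your real condition and loop.

```c
#include <limits.h>

/* Hypothetical stand-in for "some condition" in the question. */
static int condition_holds(int i, int target) {
    return i == target;
}

/* Returns the smallest matching iteration among the tasks that ran, or -1. */
int find_match(int max_iters, int target) {
    int found = INT_MAX; /* shared result, only touched inside the critical section */

    #pragma omp parallel
    #pragma omp single
    #pragma omp taskgroup
    for (int i = 0; i < max_iters; i++) {
        #pragma omp task default(none) firstprivate(i, target) shared(found)
        {
            if (condition_holds(i, target)) {
                /* Keep the minimum so a late task cannot overwrite an earlier hit. */
                #pragma omp critical(record_match)
                {
                    if (i < found)
                        found = i;
                }
                /* No-op unless cancellation is enabled (OMP_CANCELLATION=true). */
                #pragma omp cancel taskgroup
            }
        }
    }
    /* The parallel region has ended, so all updates to found are visible here. */
    return found == INT_MAX ? -1 : found;
}
```

Keep in mind that tasks discarded by the cancellation never evaluate the condition, so with several possible matches this returns the smallest index among the tasks that actually ran, which is not necessarily the smallest overall.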