Tags: c++, multithreading, task, openmp

All OpenMP Tasks running on the same thread


I have written a recursive parallel function using tasks in OpenMP. It gives me the correct answer and runs fine, but I think there is an issue with the parallelism. Compared with a serial solution, the run time does not scale the way it did for other parallel problems I have solved without tasks. When I print the thread number for each task, they are all running on thread 0. I am compiling and running on Visual Studio Express 2013.

int parallelOMP(int n)
{

    int a, b, sum = 0;
    int alpha = 0, beta = 0;

    for (int k = 1; k < n; k++)
    {

        a = n - (k*(3 * k - 1) / 2);
        b = n - (k*(3 * k + 1) / 2);


        if (a < 0 && b < 0)
            break;


        if (a < 0)
            alpha = 0;

        else if (p[a] != -1)
            alpha = p[a];

        if (b < 0)
            beta = 0;

        else if (p[b] != -1)
            beta = p[b];


        if (a > 0 && b > 0 && p[a] == -1 && p[b] == -1)
        {
            #pragma omp parallel
            {
                #pragma omp single
                {
                    #pragma omp task shared(p), untied
                    {
                        cout << omp_get_thread_num();
                        p[a] = parallelOMP(a);
                    }
                    #pragma omp task shared(p), untied
                    {
                        cout << omp_get_thread_num();
                        p[b] = parallelOMP(b);
                    }
                    #pragma omp taskwait
                }
            }

            alpha = p[a];
            beta = p[b];
        }

        else if (a > 0 && p[a] == -1)
        {
            #pragma omp parallel
            {
                #pragma omp single
                {
                    #pragma omp task shared(p), untied
                    {
                        cout << omp_get_thread_num();
                        p[a] = parallelOMP(a);
                    }

                    #pragma omp taskwait
                }
            }

            alpha = p[a];
        }

        else if (b > 0 && p[b] == -1)
        {
            #pragma omp parallel
            {
                #pragma omp single
                {
                    #pragma omp task shared(p), untied
                    {
                        cout << omp_get_thread_num();
                        p[b] = parallelOMP(b);
                    }

                    #pragma omp taskwait
                }
            }

            beta = p[b];
        }


        if (k % 2 == 0)
            sum += -1 * (alpha + beta);
        else
            sum += alpha + beta;


    }

    if (sum > 0)
        return sum%m;
    else
        return (m + (sum % m)) % m;
}

Solution

  • Actual Problem:

    You are using Visual Studio 2013.

    Visual Studio 2013, like every earlier release, supports no OMP version beyond 2.0 (see here).

    OMP Tasks are a feature of OMP 3.0 (see spec).

    Ergo, using VS at all means no OMP tasks for you.

    If OMP Tasks are an essential requirement, use a different compiler. If OMP is not an essential requirement, consider an alternative parallel task handling library. Visual Studio includes the MS Concurrency Runtime, and the Parallel Patterns Library built on top of it. I recently moved from OMP to PPL because I use VS for work; it isn't quite a drop-in replacement, but it is quite capable.
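
One way to check what a given compiler offers is the standard `_OPENMP` macro, which holds the release date (yyyymm) of the OpenMP spec the compiler implements: 200203 for 2.0 and 200805 for 3.0. The sketch below is only an illustration; the helper names `openmp_version_date` and `supports_tasks` are made up for this example.

```cpp
// Returns the value of the standard _OPENMP macro (yyyymm of the supported spec),
// or 0 when the code is compiled without OpenMP enabled.
long openmp_version_date()
{
#ifdef _OPENMP
    return _OPENMP;
#else
    return 0;
#endif
}

// Hypothetical helper: tasks were introduced in OpenMP 3.0, whose _OPENMP value is 200805.
bool supports_tasks(long version_date)
{
    return version_date >= 200805;
}
```

Calling `supports_tasks(openmp_version_date())` from `main` and printing the result should show whether the `task` directive can be expected to work with the compiler and flags in use.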


    My second attempt at solving this, again preserved for historical reasons:

    So, the problem is almost certainly that you're defining your omp tasks outside of an omp parallel region.

    Here's a contrived example:

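    #include <iostream>
    #include <omp.h>
    #include <unistd.h>   // sleep() (POSIX)

    void work()
    {
        // The parallel region creates the team of threads that will run the tasks.
        #pragma omp parallel
        {
            // Only one thread creates the tasks; the others are free to execute them.
            #pragma omp single nowait
            for (int i = 0; i < 5; i++)
            {
                #pragma omp task untied
                {
                    std::cout <<
                        "starting task " << i <<
                        " on thread " << omp_get_thread_num() << "\n";

                    sleep(1);
                }
            }
        }
    }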
    

    If you omit the parallel declaration, the job runs serially:

    starting task 0 on thread 0
    starting task 1 on thread 0
    starting task 2 on thread 0
    starting task 3 on thread 0
    starting task 4 on thread 0
    

    But if you leave it in:

    starting task starting task 3 on thread 1
    starting task 0 on thread 3
    2 on thread 0
    starting task 1 on thread 2
    starting task 4 on thread 2
    

    Success, complete with authentic misuse of shared output resources.

    (For reference, if you omit the single declaration, every thread runs the loop, so 20 tasks get run on my 4-CPU VM: 4 threads times 5 iterations.)
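
The effect of `single` on the number of tasks can be checked by counting them. This is a rough sketch, not part of the original example; the function names and the `team_size` out-parameter are invented for illustration. It also compiles without OpenMP, in which case the pragmas are ignored and the team size stays at 1.

```cpp
#ifdef _OPENMP
#include <omp.h>
#endif

// With `single`, one thread runs the loop, so `iterations` tasks are created in total.
int tasks_with_single(int iterations)
{
    int created = 0;
    #pragma omp parallel shared(created)
    {
        #pragma omp single
        for (int i = 0; i < iterations; i++)
        {
            #pragma omp task shared(created)
            {
                #pragma omp atomic
                created++;
            }
        }
    }   // implicit barrier: every task has finished by this point
    return created;
}

// Without `single`, every thread in the team runs the loop and creates its own tasks.
int tasks_without_single(int iterations, int* team_size)
{
    int created = 0;
    *team_size = 1;   // stays 1 when compiled without OpenMP
    #pragma omp parallel shared(created)
    {
#ifdef _OPENMP
        #pragma omp master
        *team_size = omp_get_num_threads();
#endif
        for (int i = 0; i < iterations; i++)
        {
            #pragma omp task shared(created)
            {
                #pragma omp atomic
                created++;
            }
        }
    }
    return created;
}
```

With 5 iterations, the first function should return 5, and the second should return 5 times whatever team size the runtime picked.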


    Original answer included below for completeness, but no longer relevant!

    In every case, your omp task is a single, simple thing. It probably runs and completes immediately:

    #pragma omp task shared(p), untied
    cout << omp_get_thread_num();
    
    #pragma omp task shared(p), untied
    cout << omp_get_thread_num();
    
    #pragma omp task shared(p), untied
    cout << omp_get_thread_num();
    
    #pragma omp task shared(p), untied
    cout << omp_get_thread_num();
    

    Because you never start one long-running task before firing off the next task, everything will probably run on the first allocated thread.

    Perhaps you meant to do something like this?

    if (a > 0 && b > 0 && p[a] == -1 && p[b] == -1)
    {
        #pragma omp task shared(p), untied
        {
            cout << omp_get_thread_num();
            p[a] = parallelOMP(a);
        }
    
        #pragma omp task shared(p), untied
        {
            cout << omp_get_thread_num();
            p[b] = parallelOMP(b);
        }
    
        #pragma omp taskwait
    
        alpha = p[a];
        beta = p[b];
    }
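
For tasks written this way to be spread across threads, they still need an enclosing parallel region. A common pattern is to open that region once around the top-level call, instead of inside every recursive call. The sketch below shows the pattern on a naive recursive Fibonacci, used here as a hypothetical stand-in for `parallelOMP`; it assumes a compiler with OpenMP 3.0 or later and is not the asker's actual algorithm.

```cpp
// Recursive worker: each call spawns two child tasks and waits for both.
// It relies on being called from inside a parallel region.
long fib_task(int n)
{
    if (n < 2)
        return n;

    long x = 0, y = 0;

    // n is firstprivate by default; x and y must be shared so the results are visible here.
    #pragma omp task shared(x)
    x = fib_task(n - 1);

    #pragma omp task shared(y)
    y = fib_task(n - 2);

    #pragma omp taskwait
    return x + y;
}

// Entry point: one parallel region, one thread starts the recursion,
// and the rest of the team executes the tasks it generates.
long fib(int n)
{
    long result = 0;
    #pragma omp parallel shared(result)
    {
        #pragma omp single
        result = fib_task(n);
    }
    return result;
}
```

In real code a cutoff that stops creating tasks for small `n` would likely be needed, since each task here does very little work.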