How to spawn a single thread with OpenMP (like a std::thread()) and use "#pragma omp single" and "#pragma omp for" afterwards?


I would simply like to spawn a background thread, like std::thread, but with OpenMP. Is this possible? If yes, how is it done? To better explain what I want to achieve, here is the C++ code for what I want to do:

//05.06.2024
//How to spawn a single thread with OpenMP (like a std::thread()) and use "#pragma omp single" and 
//"#pragma omp for" afterwards? This here does not work! (as expected)
//OS: Linux Mint 21.3, Kernel: 6.5.0-35-generic x86_64 bits
//compiler: g++ (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0

// g++ -O2 openmp_single_bkgtask_test.cpp -fopenmp 

#include <omp.h>
#include <iostream>
#include <unistd.h>

void backgroundWork()
{
    while(true)
    {
        std::cout<<"background work!"<<std::endl;
        usleep(1e6);
    }   
}

int main()
{           
    omp_set_num_threads(8); 
        
    #pragma omp parallel
    {
        #pragma omp single nowait
        #pragma omp task
        backgroundWork();
                
        while(true)
        {
            #pragma omp single
            std::cout<<"loading data"<<std::endl;
            //++++++++++
            //the program never gets past this point: the implicit barrier at the end of
            //"#pragma omp single" waits for every thread of the team, but the thread
            //running the background task never arrives (the task never finishes)
            //++++++++++
            #pragma omp for
            for(int i=0;i<16;i++)
            {
                std::cout<<"processing data "<<i<<std::endl;
            }                   
        }       
    }           
    
    return 0;
}

I know this cannot work: the implicit barrier at the end of omp single waits for all threads of the team, but one of them is "missing" because it is stuck in the task, so the program hangs as expected. But how can I spawn a background thread in OpenMP?


Solution

  • Here is a solution using only tasks, with an omp single pragma for the whole parallel region:

    int main()
    {           
    
        #pragma omp parallel
        #pragma omp single
        {
            #pragma omp task
            backgroundWork();
                    
            int v = 0;
            while(true)
            {
                v++;
                #pragma omp critical
                std::cout<<v<<" loading data"<<std::endl;
                #pragma omp taskloop
                for(int i=0;i<4;i++)
                {
                    usleep(1e5);
                    #pragma omp critical
                    std::cout<<v<<" processing data "<<i<<std::endl;
                }                   
            }
        }           
        
        return 0;
    }
    

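    As suggested by Jerôme Richard, the omp for is replaced by an omp taskloop, because a worksharing construct such as omp for may not be closely nested inside a single region. The critical pragmas only serve to keep the lines of text output from interleaving.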
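
    For experimenting with this pattern it can help to have a variant that terminates. The following is only a minimal sketch, not part of the original answer: boundedBackgroundWork and runRounds are hypothetical names, and the background task performs a fixed number of ticks instead of looping forever, so that the implicit barrier at the end of the parallel region can be reached. Compile with -fopenmp as in the question.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <unistd.h>

// Hypothetical stand-in for backgroundWork(): a bounded number of ticks
// instead of an endless loop, so the program can terminate.
void boundedBackgroundWork(int ticks, std::atomic<int>& ticksDone)
{
    for (int t = 0; t < ticks; t++)
    {
        usleep(1000);
        ticksDone++;
    }
}

// Runs `rounds` load/process rounds of `items` iterations each, next to a
// background task of `ticks` ticks. Returns the number of processed items.
int runRounds(int rounds, int items, int ticks)
{
    assert(rounds >= 0 && items >= 0 && ticks >= 0);
    std::atomic<int> processed{0};
    std::atomic<int> ticksDone{0};

    #pragma omp parallel
    #pragma omp single
    {
        // Deferred task: any thread of the team may pick it up.
        #pragma omp task shared(ticksDone)
        boundedBackgroundWork(ticks, ticksDone);

        for (int v = 1; v <= rounds; v++)
        {
            #pragma omp critical
            std::cout << v << " loading data" << std::endl;

            // taskloop waits for the tasks it generates (implicit taskgroup),
            // so a round is complete before the next one starts.
            #pragma omp taskloop shared(processed)
            for (int i = 0; i < items; i++)
            {
                processed++;
            }
        }
    } // implicit barrier: all tasks, including the background one, have finished

    assert(ticksDone.load() == ticks);
    return processed.load();
}
```

    Calling runRounds(3, 4, 5) from main() prints three "loading data" lines and returns 12, the total number of processed items.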

    Since the thread executing the single region is mostly idle while the taskloop runs, you may use one more thread than the number of cores, and yet another one if the background task does almost nothing.

    Note that there is actually no guarantee that the background task is executed first (the OpenMP runtime is allowed to schedule the tasks on its own), although in practice it very likely will be. An alternative that guarantees the execution, provided the team has at least two threads, is based on sections:

    int main()
    {           
    
        #pragma omp parallel sections
        {
            #pragma omp section
            backgroundWork();
                    
            #pragma omp section
            {
                int v = 0;
                while(true)
                {
                    v++;
                    #pragma omp critical
                    std::cout<<v<<" loading data"<<std::endl;
                    #pragma omp taskloop
                    for(int i=0;i<4;i++)
                    {
                        usleep(1e5);
                        #pragma omp critical
                        std::cout<<v<<" processing data "<<i<<std::endl;
                    }   
                }                
            }
        }           
        
        return 0;
    }
    

    Third solution with sections and nested parallelism:

    int main()
    {           
    
        omp_set_nested(1); // enable nested parallelism (deprecated since OpenMP 5.0)
    
        #pragma omp parallel sections num_threads(2)
        {
            #pragma omp section
            backgroundWork();
                    
            #pragma omp section
            {
                int v = 0;
                while(true)
                {
                    v++;
                    #pragma omp critical
                    std::cout<<v<<" loading data "<<omp_get_thread_num()<<std::endl;
                    #pragma omp parallel for num_threads(2)
                    for(int i=0;i<4;i++)
                    {
                        usleep(1e5);
                        #pragma omp critical
                        std::cout<<v<<" processing data "<<i<<" "<<omp_get_thread_num()<<std::endl;
                    }   
                }                
            }
        }           
        
        return 0;
    }
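
    Since omp_set_nested() is deprecated as of OpenMP 5.0, the nested variant could also be written with omp_set_max_active_levels(). The following is a terminating sketch under that assumption, not the original author's code: runNested is a hypothetical name, the background section is bounded to three short sleeps, and runtimes that predate OpenMP 5.0 semantics may still need omp_set_nested(1) or OMP_NESTED=true to actually create the inner team.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <unistd.h>
#ifdef _OPENMP
#include <omp.h>
#endif

// Runs `rounds` rounds of `items` iterations in a nested parallel region,
// next to a bounded background section. Returns the number of processed items.
int runNested(int rounds, int items)
{
    assert(rounds >= 0 && items >= 0);
    std::atomic<int> processed{0};

#ifdef _OPENMP
    // Allow two levels of active parallel regions (outer sections + inner for).
    omp_set_max_active_levels(2);
#endif

    #pragma omp parallel sections num_threads(2)
    {
        #pragma omp section
        {
            // Bounded stand-in for backgroundWork().
            for (int t = 0; t < 3; t++)
            {
                usleep(1000);
            }
        }

        #pragma omp section
        {
            for (int v = 1; v <= rounds; v++)
            {
                #pragma omp critical
                std::cout << v << " loading data" << std::endl;

                // Inner team: a worksharing loop is legal here because it
                // belongs to its own (nested) parallel region.
                #pragma omp parallel for num_threads(2)
                for (int i = 0; i < items; i++)
                {
                    processed++;
                }
            }
        }
    }

    return processed.load();
}
```

    The result does not depend on whether the inner region really gets two threads: runNested(3, 4) returns 12 either way, only the degree of parallelism differs.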