Search code examples
rfortranopenmp

How to make an OpenMP parallel code to run serially by default?


I have a parallel code with OpenMP which is a mix of Fortran, C and R. The Fortran and C are the back-end or solver of the code (it's a computational code) and the R part is just an interface to launch a simulation, meaning that the simulation parameters and configuration is defined using an R script that in turn calls the functions and methods in the back-end part of the the code (you know how it works).
The parallelism is implemented in the Fortran part of the code and by default OpenMP uses all the cores/threads. This is not the desired behavior I want the code run serially by default and give the user the possibility of defining the number of cores/threads to use.
To this end I defined a module as follows

module thread_control
    use omp_lib
    implicit none
contains
    subroutine set_default_threads()
        ! Get the number of threads set by the environment
        integer :: nthreads
        nthreads = omp_get_max_threads()

        ! If no environment variable is set, default to 1 thread
        if (nthreads == omp_get_num_procs()) then
            call omp_set_num_threads(1)
        endif

    end subroutine set_default_threads
end module thread_control

Then I call to the subroutine of the above module in the files where I have parallel regions using use thread_control and call set_default_threads().
Here is an example of the call to the subroutine and the type of OpenMP directive I'm using:

call set_default_threads()
!$OMP PARALLEL PRIVATE(param1, param2, param3, param4, param5), &
    !$OMP& PRIVATE(Num1, Num2, Num3, Num4)
    !$OMP DO
    ! Loop over the mesh
    ! Here goes a huge for loop and conditionals (if statements)
    !$OMP END DO
!$OMP END PARALLEL

Here goes a second example

!$OMP PARALLEL PRIVATE(Tmp)
    !$OMP SECTIONS
    !$OMP SECTION
! some stuff happens here....

    !$OMP SECTION
! some stuff happens here....

    !$OMP SECTION
! some stuff happens here....

 !$OMP END SECTIONS NOWAIT
!$OMP END PARALLEL

I can't share the whole code, it's a big code base and it's not open source so unfortunately there is no way I can share it with you. Now the problem is when I run my R script in an R session (either RStudio or an interactive R session in my terminal) using the following command

> source("PATH-TO-MY-R-SCRIPT")

It gives me an error like

libgomp: Out of memory allocating 418040474304 bytes

I didn't have this problem before introducing the thread_control.f90 module.
Now the interesting part is that if I use the following command

> system("OMP_NUM_THREADS=4 Rscript PATH-TO-MY-R-SCRIPT")

Things work smoothly.
Another interesting observation is that before introducing the thread_control.f90 module I experimented with my R script to see if I can override the number of cores/threads using my R script so I added the following lines to the beginning of my R script

# R script: simulation.R

# Set the number of threads (e.g., 4)
num_threads <- 4
Sys.setenv(OMP_NUM_THREADS = num_threads) 

Despite the introduction of the above lines my code used all the 16 existing threads on my system, as it does by default using OpenMP. So it seems to me that there is a conflict between R environment and Fortran code.
I'm not sure if this amount of information could be enough for some geeks out here to help me out with my problem, but I hope it is. Although if you think you need some more details please feel free to ask, I'll provide you with necessary details.
I asked this question from chatgpt and it gave an answer along the following lines

Potential Causes and Solutions

  • Differences in Execution Environment:

    • When you use system("OMP_NUM_THREADS=4 Rscript ..."), a new R process is created with its own environment, which can differ from the environment of an interactive R session. This new process might have different memory management or threading behavior.

    • When running the script using source(), the code runs within the current R session, which might have different memory constraints or conflicting settings.

  • Threading Issues:

    • Running OpenMP parallel regions in a shared environment (like an interactive R session) can cause threading issues if the environment isn't properly isolated. This can lead to race conditions, deadlocks, or memory allocation issues.

    • OpenMP may not always behave as expected within an R session due to how R handles its memory and threading. This is especially true if other packages or R's internal processes are also using threads or if memory is fragmented.

  • Memory Management:
    • The "Out of memory" error might be caused by trying to allocate an enormous amount of memory due to an issue with how memory is being requested or managed within the parallel regions.
    • When running within the R session, memory fragmentation or other processes within the same session may cause unpredictable memory availability.

Solutions

Here are some strategies to address these issues:

  1. Run Code in a Separate Process

    • Continue Using system(): Since running the R script with system("OMP_NUM_THREADS=4 Rscript ...") works correctly, this suggests that separating the execution from the interactive session resolves the issues. Consider using this method as the primary way to run your script.
    • Isolate Execution: If you need to execute the script within R but still want to isolate it, you could wrap the logic that runs your Fortran code in an R function and then run that function in a separate R process using parallel::mclapply() or a similar approach.
  2. Control Threading and Memory

    • Limit Threads in Interactive Sessions: You can try limiting the number of threads even more conservatively when running interactively, by explicitly setting OMP_NUM_THREADS=1 when using source(). This can help avoid threading conflicts.
    • Check Memory Allocation: Review your Fortran code for any large memory allocations, especially within parallel regions. Ensure that these allocations are necessary and that they don’t inadvertently request too much memory.
  3. Debugging

    • Use Debugging Tools: Tools like Valgrind (for memory errors) or GDB (for debugging segmentation faults) can be helpful if you want to track down the specific cause of the errors when running in source(). This might reveal issues related to memory access or threading that only appear in certain environments.
    • Add Debugging Information: Print statements or logging in both R and Fortran can help trace where the issue occurs. Start by identifying if the issue is in the Fortran code or in the interaction between R and Fortran.

P.S.
My code is compiled using Makefile and I'm working on Ubuntu.


Solution

  • There is a logical problem in the fortran module used to give the user the possibility of defining the number of threads in an R script. The problem is the following :

    • This module uses omp_get_max_threads() to obtain the number of threads. However, omp_get_max_threads() does not necessarily return the number of threads based on the OMP_NUM_THREADS environment variable alone. By default (if OMP_NUM_THREADS is not set), omp_get_max_threads() returns the maximum number of threads available based on the number of hardware cores or processors. This means that omp_get_max_threads() doesn’t provide any information about whether OMP_NUM_THREADS is set explicitly by the user. It just returns what OpenMP is going to use based on the system.
    • In this module one tries to default to serial execution by comparing omp_get_max_threads() with omp_get_num_procs(). This logic is also flawed because: omp_get_num_procs() returns the number of physical processors or cores available on the machine, not the number of threads OpenMP is using. Even if the environment variable OMP_NUM_THREADS is set, this comparison would still fail to properly detect the user's intention.

    So the analysis explained in the two above bullet points goes against the fact that the user is passing the number of desired threads using OMP_NUM_THREADS in the R script. The check for OMP_NUM_THREADS is not explicitly addressed in the fortran module. Here is what the proper module should look like:

    module thread_control
        use omp_lib
        implicit none
    contains
        function get_default_threads() result(nthreads)
            integer :: nthreads, ierror
            character(len=32) :: omp_env
    
            ! Check if the OMP_NUM_THREADS environment variable is set
            call get_environment_variable('OMP_NUM_THREADS', omp_env, status=ierror)
    
            if (ierror == 0) then
                ! If OMP_NUM_THREADS is set, convert to an integer
                read(omp_env, *) nthreads
            else
                ! If not set, default to 1 thread (serial execution)
                nthreads = 1
            endif
        end function get_default_threads
    end module thread_control
    

    The above version directly retrieves the value of OMP_NUM_THREADS using the get_environment_variable function. This ensures that if the user sets OMP_NUM_THREADS, the Fortran code captures that value and uses it.
    The call get_environment_variable('OMP_NUM_THREADS', omp_env, status=ierror) line fetches the environment variable, and if successful (ierror == 0), the value is read and converted to an integer using read(omp_env, *) nthreads.
    One should keep in mind that the environment variables are treated by the system as string so a conversion is necessary.
    After defining the module as explained above; in the Fortran files where there exist a parallel region you should add use thread_control once at the beginning of the routine or program (Fortran file in general) and then right before the parallel region you should add nthreads = get_default_threads() and eventually the parallel region should contain NUM_THREADS(nthreads) like the following

        ! Get the number of threads
        nthreads = get_default_threads()
    !$OMP PARALLEL NUM_THREADS(nthreads) PRIVATE(Tmp)
        !$OMP SECTIONS
        !$OMP SECTION
    ! some stuff happens here....
    
        !$OMP SECTION
    ! some stuff happens here....
    
        !$OMP SECTION
    ! some stuff happens here....
    
     !$OMP END SECTIONS NOWAIT
    !$OMP END PARALLEL
    

    Now you can simply add the following block at the very beginning of the R script before any call to the functions that use parallel process

    # Set the number of threads (e.g., 4)
     num_threads <- 4
     Sys.setenv(OMP_NUM_THREADS = num_threads)
    

    and this should be taken into account by the fortran code and OpenMP.