python, c, openmp, ctypes

Calling an OpenMP C library from Python and environment variables


I would like to call some OpenMP program from Python, changing the number of threads (OMP_NUM_THREADS) and their bindings (OMP_PLACES, OMP_PROC_BIND). I wrote this program:

import os
from ctypes import cdll

lib = cdll.LoadLibrary("/home/fayard/Desktop/libf.so")

nb_socket = 2
nb_core_per_socket = 14
nb_thread_per_core = 2

n = nb_socket * nb_core_per_socket * nb_thread_per_core

for nb_thread in range(1, n + 1):
    os.environ['OMP_NUM_THREADS'] = str(nb_thread)
    print("nb_thread: {}, omp_get_num_threads: {}".format(
        nb_thread, lib.num_threads()))

The OpenMP library is the following:

#include <omp.h>

extern "C" {

int num_threads() {
  int ans;
#pragma omp parallel
  {
#pragma omp single
    ans = omp_get_num_threads();
  }

  return ans;
}

}

and is compiled with:

g++ -c -fPIC -fopenmp f.cpp -o f.o
g++ -shared -fopenmp -Wl,-soname,libf.so -o libf.so f.o

When I run python program.py, I get:

nb_thread: 1, omp_get_num_threads: 56
...
nb_thread: 56, omp_get_num_threads: 56

which is not what I want! I also noticed that, when the library is compiled with the Intel compilers using exactly the same arguments, I get:

nb_thread: 1, omp_get_num_threads: 1
...
nb_thread: 56, omp_get_num_threads: 1

Any thoughts on what's going wrong?


Solution

  • The environment variables only control the initial settings of the internal control variables.

    [OpenMP 4.5] 4. Environment variables

    Modifications to the environment variables after the program has started, even if modified by the program itself, are ignored by the OpenMP implementation. However, the settings of some of the ICVs can be modified during the execution of the OpenMP program by the use of the appropriate directive clauses or OpenMP API routines.

    You can write a small wrapper around omp_set_num_threads and call it from Python to change the number of threads at run time. The thread bindings, however, cannot be changed dynamically.

    Unfortunately, there is no clean way to unload shared libraries in ctypes. An alternative would be to run an actual program using subprocess instead of loading a library, but then you have a different interface.

    If you must use shared libraries and control the bindings dynamically, you could set the thread affinities manually with sched_setaffinity inside the shared library that Python calls.

    The GCC and Intel runtimes likely behave differently because you set the environment variable after loading the library, and the two runtimes perform their initialization differently.
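
The wrapper around omp_set_num_threads suggested in the answer could be driven from Python roughly as follows. This is only a sketch under an assumption: the library is taken to export, next to num_threads from the question, a hypothetical function set_num_threads(int) that does nothing but forward its argument to omp_set_num_threads. That name does not appear in the original library and would have to be added inside the same extern "C" block.

```python
from ctypes import c_int


def measure_thread_counts(lib, max_threads):
    """Return (requested, reported) pairs for 1..max_threads.

    `lib` is the object returned by ctypes.cdll.LoadLibrary. It is assumed
    to export num_threads() from the question and a hypothetical wrapper
    set_num_threads(int) that forwards to omp_set_num_threads.
    """
    # Declare the C signatures so ctypes converts arguments and results.
    lib.set_num_threads.argtypes = [c_int]
    lib.set_num_threads.restype = None
    lib.num_threads.argtypes = []
    lib.num_threads.restype = c_int

    results = []
    for requested in range(1, max_threads + 1):
        # Request a team size at run time instead of editing os.environ,
        # which the runtime ignores once the program has started.
        lib.set_num_threads(requested)
        results.append((requested, lib.num_threads()))
    return results
```

With the library from the question, this would be called as measure_thread_counts(cdll.LoadLibrary("/home/fayard/Desktop/libf.so"), n). It only addresses the number of threads; OMP_PLACES and OMP_PROC_BIND are not affected by it.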
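
The subprocess alternative from the answer could look roughly like the sketch below. Each child process gets its own copy of the environment, so an OpenMP runtime in the child sees OMP_NUM_THREADS, OMP_PLACES and OMP_PROC_BIND when it initializes. The helper name run_with_omp_env and the stand-in child command are assumptions made for illustration; the child here is a small Python one-liner that only echoes the variable, where a real setup would launch an executable built from the OpenMP code.

```python
import os
import subprocess
import sys


def run_with_omp_env(command, nb_thread, places=None, proc_bind=None):
    """Run `command` in a child process with OpenMP variables set.

    Returns the child's standard output as a string.
    """
    env = dict(os.environ)  # copy, so the parent's environment is untouched
    env["OMP_NUM_THREADS"] = str(nb_thread)
    if places is not None:
        env["OMP_PLACES"] = places
    if proc_bind is not None:
        env["OMP_PROC_BIND"] = proc_bind
    completed = subprocess.run(
        command, env=env, check=True, capture_output=True, text=True
    )
    return completed.stdout


# Stand-in child used only for illustration: it prints the variable it sees.
# Replace it with the path to an executable built from the OpenMP code.
ECHO_CHILD = [
    sys.executable,
    "-c",
    "import os; print(os.environ['OMP_NUM_THREADS'])",
]

if __name__ == "__main__":
    for nb_thread in (1, 2, 4):
        print(nb_thread, run_with_omp_env(ECHO_CHILD, nb_thread).strip())
```

The price, as the answer notes, is a different interface: results come back through standard output (or files) rather than as return values of a ctypes call.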