I do not understand the difference between omp_get_num_threads() and omp_get_max_threads(). I copied the following demo code.
omp_set_nested(1);
omp_set_max_active_levels(10);
omp_set_dynamic(0);
omp_set_num_threads(2);
#pragma omp parallel
{
    omp_set_num_threads(3);
    #pragma omp parallel
    {
        omp_set_num_threads(4);
        #pragma omp single
        {
            std::cout << omp_get_max_active_levels() << " " << omp_get_num_threads() << " "
                      << omp_get_max_threads() << std::endl;
        }
    }
    #pragma omp barrier
    #pragma omp single
    {
        std::cout << omp_get_max_active_levels() << " " << omp_get_num_threads() << " "
                  << omp_get_max_threads() << std::endl;
    }
}
And then I got the following output.
10 3 4
10 3 4
10 3 4
10 3 3
I have checked the official documentation, but I am still confused.
From the documentation:
omp_get_num_threads
The omp_get_num_threads routine returns the number of threads in the team executing the parallel region to which the routine region binds. If called from the sequential part of a program, this routine returns 1.
omp_get_max_threads
The value returned by omp_get_max_threads is the value of the first element of the nthreads-var ICV of the current task. This value is also an upper bound on the number of threads that could be used to form a new team if a parallel region without a num_threads clause were encountered after execution returns from this routine.
The figure below illustrates the flow of threads. Your output may be incorrect; I can't reproduce it with either clang + libomp or gcc + libgomp.
omp_get_max_threads always returns an upper bound on the number of threads that a new parallel construct can create when no num_threads clause is specified with it. When you call omp_set_num_threads(4) inside the inner parallel region, a new region started from there could use up to 4 threads, while the inner region itself is using 3. For the outer parallel region, the max is 3, and 2 are in use.
In serial code, outside any parallel region, the number of threads is 1, but the max is the system default (usually the number of cores), unless you changed it via omp_set_num_threads or the OMP_NUM_THREADS environment variable.
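To make the "in use" versus "max" distinction concrete, here is a minimal sketch that records both values in the serial part, in an outer region, and in a nested region. The names Snapshot, Report, take_snapshot, collect_report and print_report are hypothetical helpers, not part of OpenMP, and the serial stand-ins under #else are an assumption added only so the sketch still builds without -fopenmp. The values in the comments assume OpenMP is enabled (for example g++ -fopenmp) and that the runtime grants the requested team sizes.

```cpp
#include <iostream>
#ifdef _OPENMP
#include <omp.h>
#else
// Serial stand-ins (an assumption for portability): without -fopenmp the
// pragmas are ignored and every value below is simply 1.
inline int omp_get_num_threads() { return 1; }
inline int omp_get_max_threads() { return 1; }
inline void omp_set_num_threads(int) {}
inline void omp_set_dynamic(int) {}
inline void omp_set_max_active_levels(int) {}
#endif

// What the two routines report at one point in the program.
struct Snapshot {
    int in_use;  // omp_get_num_threads(): size of the current team
    int max;     // omp_get_max_threads(): bound for the next team
};

inline Snapshot take_snapshot() {
    return {omp_get_num_threads(), omp_get_max_threads()};
}

struct Report {
    Snapshot serial{}, outer{}, inner{};
};

// Hypothetical helper: records the values at three nesting depths.
inline Report collect_report() {
    Report r;
    omp_set_dynamic(0);            // do not let the runtime shrink the teams
    omp_set_max_active_levels(2);  // allow one level of nested parallelism
    omp_set_num_threads(2);
    r.serial = take_snapshot();    // sequential part: 1 in use, max 2

    #pragma omp parallel
    {
        omp_set_num_threads(3);    // applies to regions this thread starts
        Snapshot inner_seen{};     // private to each outer thread
        #pragma omp parallel
        {
            omp_set_num_threads(4);
            #pragma omp single
            inner_seen = take_snapshot();  // nested team: 3 in use, max 4
        }
        #pragma omp single
        {
            r.inner = inner_seen;
            r.outer = take_snapshot();     // outer team: 2 in use, max 3
        }
    }
    return r;
}

inline void print_report(const Report& r) {
    std::cout << "serial: " << r.serial.in_use << " in use, max " << r.serial.max << "\n"
              << "outer:  " << r.outer.in_use << " in use, max " << r.outer.max << "\n"
              << "inner:  " << r.inner.in_use << " in use, max " << r.inner.max << "\n";
}
```

Calling print_report(collect_report()) from main prints one line per level. In each line, "in use" describes the team that is already running, while "max" describes the team that would be created by the next parallel construct without a num_threads clause.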