Difference between num_threads vs. omp_set_num_threads vs OMP_NUM_THREADS
Think of it in terms of scope. Option 3 (the num_threads clause) sets the number of threads only for the parallel region it is attached to. The other two options change global state. I generally don't set the number of threads at all and just use the defaults. When I do change it, it's usually only for a special case, so I use option 3: the next parallel region then goes back to the global (default) setting. See the code below, where the region after the num_threads(2) one returns to the last global setting.
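    #include <stdio.h>
    #include <omp.h>

    int main(void) {
        /* 1: no setting at all, so the default team size is used */
        #pragma omp parallel
        {
            #pragma omp single
            {
                printf("%d\n", omp_get_num_threads());
            }
        }

        /* 2: global setting, applies to all later parallel regions */
        omp_set_num_threads(8);
        #pragma omp parallel
        {
            #pragma omp single
            {
                printf("%d\n", omp_get_num_threads());
            }
        }

        /* 3: num_threads clause, applies to this region only */
        #pragma omp parallel num_threads(2)
        {
            #pragma omp single
            {
                printf("%d\n", omp_get_num_threads());
            }
        }

        /* 4: no clause, so the last global setting (8) is used again */
        #pragma omp parallel
        {
            #pragma omp single
            {
                printf("%d\n", omp_get_num_threads());
            }
        }
        return 0;
    }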
Output (the first value is the default, which depends on the machine and environment):
4
8
2
8
OMP_NUM_THREADS and omp_set_num_threads() are not equivalent. The environment variable is only used to set the initial value of the nthreads-var ICV (internal control variable), which controls the maximum number of threads in a team. omp_set_num_threads() can be used to change the value of nthreads-var at any time (outside of any parallel regions, of course) and affects all subsequent parallel regions. Therefore setting OMP_NUM_THREADS to some value n is equivalent to calling omp_set_num_threads(n) before the very first parallel region is encountered.
The algorithm to determine the number of threads in a parallel region is very clearly described in the OpenMP specification that is available freely on the OpenMP website:
if a num_threads clause exists
then let ThreadsRequested be the value of the num_threads clause expression;
else let ThreadsRequested = value of the first element of nthreads-var;
The priority of the different ways to set nthreads-var is listed in the ICV Override Relationships part of the specification:
The num_threads clause and omp_set_num_threads() override the value of the OMP_NUM_THREADS environment variable and the initial value of the first element of the nthreads-var ICV.
Translated into human language, that is:
- OMP_NUM_THREADS (if present) specifies the initial number of threads;
- calls to omp_set_num_threads() override the value of OMP_NUM_THREADS;
- the presence of the num_threads clause overrides both other values.
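The three rules can be mimicked with a small model in plain C. This is only a sketch of the precedence logic, not how any OpenMP runtime is implemented: icv_model and the model_* functions are invented for illustration, and the model ignores dynamic adjustment and thread limits.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only; none of these names belong to the OpenMP API. */
typedef struct {
    int nthreads_var; /* stands in for the first element of nthreads-var */
} icv_model;

/* Program start: OMP_NUM_THREADS (if present) seeds nthreads-var,
   otherwise the implementation default is used. */
static icv_model model_init(bool env_present, int env_value, int impl_default) {
    icv_model m = { env_present ? env_value : impl_default };
    return m;
}

/* Models omp_set_num_threads(n): overwrites nthreads-var for later regions. */
static void model_set_num_threads(icv_model *m, int n) {
    assert(n > 0);
    m->nthreads_var = n;
}

/* Models entering a parallel region: a num_threads clause wins for this
   region only and leaves nthreads-var untouched. */
static int model_threads_requested(const icv_model *m, bool has_clause,
                                   int clause_value) {
    return has_clause ? clause_value : m->nthreads_var;
}
```

Starting from a default of 4, then setting 8, then requesting a region with a clause value of 2, then a plain region, the model yields 4, 8, 2, 8, matching the sequence discussed in the first answer.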
The actual number of threads used is also affected by whether dynamic team sizes are enabled (dyn-var ICV, settable via OMP_DYNAMIC and/or omp_set_dynamic()), by whether a thread limit is imposed by thread-limit-var (settable via OMP_THREAD_LIMIT), as well as by whether nested parallelism (OMP_NESTED / omp_set_nested()) is enabled or not.