I am struggling to find the proper way to execute a hybrid OpenMP/MPI job with MPICH (hydra).
I can launch the processes easily and they do spawn threads, but the threads stay bound to the same core as their master thread regardless of which -bind-to option I try.
If I explicitly set GOMP_CPU_AFFINITY to 0-15, all threads are spread out, but only if I have 1 process per node. I don't want that; I want one process per socket.
Setting OMP_PROC_BIND=false has no noticeable effect.
One example of the many combinations I tried:
export OMP_NUM_THREADS=8
export OMP_PROC_BIND="false"
mpiexec.hydra -n 2 -ppn 2 -envall -bind-to numa ./a.out
What I get is one process sitting entirely on one of the cores 0-7 at 100%, and several threads on cores 8-15, only one of which is close to 100% (they are waiting on the first process).
Since libgomp is missing the equivalent of the respect clause of Intel's KMP_AFFINITY, you could work around it by providing a wrapper script that reads the list of allowed CPUs from /proc/PID/status (Linux-specific):
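#!/bin/sh
# Take the CPU list this process is allowed to run on (inherited from the launcher).
GOMP_CPU_AFFINITY=$(grep ^Cpus_allowed_list /proc/self/status | grep -Eo '[0-9,-]+')
export GOMP_CPU_AFFINITY
# Replace the shell with the real program; "$@" keeps arguments with spaces intact.
exec "$@"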
This should then work with -bind-to numa.
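If you want to see what the grep pipeline in the wrapper actually extracts, here is a minimal sketch that runs it on a made-up status excerpt. The function name allowed_cpus and the sample values are mine, purely for illustration; the pipeline itself is the same one the wrapper uses.

```shell
#!/bin/sh
# Hypothetical helper (the name is an assumption): read /proc/PID/status-style
# text on stdin and print only the value of the Cpus_allowed_list field.
allowed_cpus() {
    grep '^Cpus_allowed_list' | grep -Eo '[0-9,-]+'
}

# Made-up excerpt, roughly what a process bound to cores 0-7 and 16-23 might see.
printf 'Name:\ta.out\nCpus_allowed:\t00ff00ff\nCpus_allowed_list:\t0-7,16-23\n' | allowed_cpus
# prints: 0-7,16-23

# On Linux, show the list for the current shell as well.
if [ -r /proc/self/status ]; then
    allowed_cpus < /proc/self/status
fi
```

Assuming the wrapper is saved as gomp_wrapper.sh (a hypothetical name) and made executable, the launch line from the question would become: mpiexec.hydra -n 2 -ppn 2 -envall -bind-to numa ./gomp_wrapper.sh ./a.out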