I have compiled a weather forecasting application with Open MPI in double precision on Ubuntu 14.04, using the Intel ifort compiler. However, there are a few issues I cannot figure out. I need to work out how many processors to pass to mpirun. This is the output of lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 2
Core(s) per socket: 2
Socket(s): 1
NUMA node(s): 1
Vendor ID: GenuineIntel
CPU family: 6
Model: 60
Stepping: 3
CPU MHz: 800.000
BogoMIPS: 6784.93
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 3072K
NUMA node0 CPU(s): 0-3
This is the command I am using to run my software: mpirun -np 4 aaa. But when I do this, I get these errors:
MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD
with errorcode 1001.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
When I set np to 1 it runs successfully, but it does not use the CPU fully. CPU usage varies from 3% to 35%, while memory usage is almost 100%; the system freezes for about ten minutes and then exits with the error message
forrtl: severe (41): insufficient virtual memory
I have run WRF (the software in this question is not WRF) with multiple processes and have not experienced any speed or memory issues. I could recompile in single precision, but before I do that I want to figure out how many cores (processors) to pass to mpirun.
Many Intel CPUs (including yours) support Hyper-Threading: each physical core runs two simultaneous hardware threads (instruction streams). To the Linux kernel, each hardware thread looks like an extra CPU, so lscpu reports four CPUs (CPU(s): 4). Looking carefully at the rest of the output, you will see that there are, in fact, only two physical cores:
Thread(s) per core: 2 <--- this is hyperthreading
Core(s) per socket: 2
Socket(s): 1
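In other words, 2 cores per socket × 1 socket = 2 physical cores. If you would rather compute that number in a script instead of reading it off, something along these lines should work with the standard util-linux lscpu (a sketch, not a guarantee for every lscpu version):

# Count unique (core, socket) pairs, i.e. physical cores; lines starting with '#' are headers.
lscpu -p=core,socket | grep -v '^#' | sort -u | wc -l

# Or multiply "Core(s) per socket" by "Socket(s)".
cores_per_socket=$(lscpu | awk -F: '/^Core\(s\) per socket/ {print $2}')
sockets=$(lscpu | awk -F: '/^Socket\(s\)/ {print $2}')
echo $(( cores_per_socket * sockets ))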
I don't generally recommend running multiple MPI processes on a single physical CPU core, even with hyperthreading. It tends to hurt performance and, in your case, may be contributing to the crash. Try mpiexec -np 2 aaa (with Open MPI, mpirun and mpiexec are equivalent) and see what happens. If it still crashes, there is something else wrong.
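If -np 2 runs cleanly, you may also want to pin each MPI rank to its own physical core so the two processes don't end up sharing one. The exact option depends on your Open MPI version (check mpirun --help); take the lines below as a sketch rather than exact syntax for your install:

mpirun -np 2 --bind-to-core aaa    (older Open MPI 1.6.x series, likely the Ubuntu 14.04 default)
mpirun -np 2 --bind-to core aaa    (Open MPI 1.8 and later)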
When I set np to 1 it runs successfully, but it does not use the CPU fully. CPU usage varies from 3% to 35%, while memory usage is almost 100%; the system freezes for about ten minutes and then exits with the error message forrtl: severe (41): insufficient virtual memory.
You may need to run a smaller problem size. This machine does not have enough physical memory to satisfy the requested allocations, so it falls back on virtual memory backed by swap space on disk and still runs out. In any case, you don't want a simulation to be swapping: disk is roughly 1000x slower than main memory, which is itself slow compared to the CPU.
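Before shrinking the problem, it is worth confirming how much physical memory and swap the machine actually has, and watching usage while the job runs. Standard Linux tools are enough for that (these commands are generic, not specific to your application):

free -m        # total and used physical memory and swap, in MiB
vmstat 5       # watch the si/so columns; nonzero values mean the system is swapping
top            # while aaa runs, watch its RES column (resident memory) grow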