
MPI - code only using one of two NUMA nodes


This is a long shot, but perhaps someone can help. I'm running a model (SWAN) on Windows 10, using its MPI build with MPICH2 (1.4.1p1).

I have two NUMA nodes with 36 cores each. For some reason I can't run the model on all 72 cores.

I start the model with mpiexec -n <np> swan.exe or swanrun inputfile <np>. If I specify mpiexec -n 72, the model starts 72 processes but only uses the 36 cores of one node. Even if I run two or more models at the same time, they all run on the same node, leaving the other 36 cores pretty much idle.

I'm assuming I made a mistake when installing MPICH2, but I can't quite figure out where I went wrong yet. I simply installed MPICH2 using the binary provided here (http://www.mpich.org/static/downloads/1.4.1p1/). Is there some option I overlooked, such as having to install it for both nodes separately?


Solution

  • After some digging I realised that I had multiple versions of MPI installed on my machine. I'm still not sure why my model would only run on one of the NUMA nodes at a time, and I don't know which MPI version mpiexec was actually calling. I uninstalled all MPI versions and did a clean reinstall, and I can now run on all 72 cores.
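
For anyone hitting the same problem, one way to check for several MPI installations before uninstalling anything is to list every mpiexec found on PATH, in lookup order. The following is only a minimal sketch in Python. The helper name find_all_on_path is made up for illustration, and it assumes the launcher is called mpiexec or mpiexec.exe and that the installations are reachable through PATH, which may not hold for every setup.

```python
import os


def find_all_on_path(name, path=None, extensions=("", ".exe")):
    """Return every file called `name` (plus an optional extension)
    found in the directories of `path`, in lookup order.

    `find_all_on_path` is a hypothetical helper for this sketch,
    not part of MPICH or any other MPI distribution.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    matches = []
    seen = set()
    for directory in path.split(os.pathsep):
        directory = directory.strip('"')  # PATH entries are sometimes quoted on Windows
        if not directory:
            continue
        for ext in extensions:
            candidate = os.path.join(directory, name + ext)
            key = os.path.normcase(os.path.abspath(candidate))
            if key not in seen and os.path.isfile(candidate):
                seen.add(key)
                matches.append(candidate)
    return matches


if __name__ == "__main__":
    hits = find_all_on_path("mpiexec")
    if not hits:
        print("No mpiexec found on PATH")
    for position, hit in enumerate(hits, start=1):
        # The first hit is the one a bare "mpiexec" command would normally start.
        print(position, hit)
    if len(hits) > 1:
        print("More than one mpiexec found; several MPI installations may be present.")
```

If more than one path is printed, the first one is normally what a bare mpiexec command runs. Installations that are not on PATH will not show up here, so the list of installed programs is still worth checking as well.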