I have a Python module which wraps a C++ library. The library uses MPI and is compiled with mpicxx. Everything works great on some machines, but on others I get this:
ImportError: ./_pyCombBLAS.so: undefined symbol: _ZN3MPI3Win4FreeEv
So there's an undefined symbol from the MPI library. As far as I can tell mpicxx should link everything in, yet it doesn't. Any ideas why?
It turns out that the fault was that the wrong libraries were being loaded. As you know, a cluster is likely to have several versions of MPI installed, sometimes the same version is compiled with several compilers. These are all likely to have the same filenames. In my case even though I compiled with, say MPICH GNU, the default path was to the OpenMPI PGI libraries. I didn't realize this, I thought compiling with MPICH GNU would mean the MPICH GNU libraries would be found at runtime.
Of course I couldn't actually use PGI-compiled OpenMPI because Python was compiled with GCC and PGI doesn't emit binaries fully compatible with GCC.
The solution is to set the LD_LIBRARY environment variable to match the MPI distribution you used to compile your code.