While trying to build and install Open MPI with CUDA support, make fails with the following errors:
btl_uct_module.c: In function ‘mca_btl_uct_reg_mem’:
btl_uct_module.c:214:22: error: ‘UCT_MD_MEM_ACCESS_REMOTE_GET’ undeclared (first use in this function)
uct_flags |= UCT_MD_MEM_ACCESS_REMOTE_GET;
^
btl_uct_module.c:214:22: note: each undeclared identifier is reported only once for each function it appears in
btl_uct_module.c:217:22: error: ‘UCT_MD_MEM_ACCESS_REMOTE_PUT’ undeclared (first use in this function)
uct_flags |= UCT_MD_MEM_ACCESS_REMOTE_PUT;
^
btl_uct_module.c:220:22: error: ‘UCT_MD_MEM_ACCESS_REMOTE_ATOMIC’ undeclared (first use in this function)
uct_flags |= UCT_MD_MEM_ACCESS_REMOTE_ATOMIC;
^
btl_uct_module.c:225:21: error: ‘UCT_MD_MEM_ACCESS_ALL’ undeclared (first use in this function)
uct_flags = UCT_MD_MEM_ACCESS_ALL;
^
Makefile:1912: recipe for target 'btl_uct_module.lo' failed
make[2]: *** [btl_uct_module.lo] Error 1
make[2]: Leaving directory '/home/usama/install/openmpi-4.0.1/opal/mca/btl/uct'
Makefile:2375: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/usama/install/openmpi-4.0.1/opal'
Makefile:1893: recipe for target 'all-recursive' failed
make: *** [all-recursive] Error 1
I used the following commands to configure and install:
./configure --prefix=/home/$USER/.openmpi --with-cuda
make all install
I am using the following configuration:
Ubuntu 16.04
CUDA 10.1
cuDNN 7.5
Open MPI 4.0.1
The weird thing is that the same installation on my local machine, which runs Ubuntu 18.04, builds and works fine. Is this a compatibility issue? Any thoughts?
It turned out to be a compatibility issue after all. Building Open MPI 3.1.4 instead of 4.0.1 resolved the problem.
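
For anyone hitting the same log: every undeclared identifier is a UCT_MD_MEM_ACCESS_* name, and the file that fails is in opal/mca/btl/uct, which compiles against the UCX headers already installed on the system. A likely cause is that the UCX on the Ubuntu 16.04 machine is older than what Open MPI 4.0.1 expects. Below is a minimal sketch for checking that. It assumes the UCX headers are under /usr/include/uct (this path is a guess and depends on how UCX was installed), and headers_declare is just a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed if any file under directory $1 mentions symbol $2.
# -r searches recursively, -q prints nothing, -s hides "no such file" errors.
headers_declare() {
    grep -rqs -- "$2" "$1"
}

# Assumed header location; override by setting UCX_INCLUDE_DIR to your UCX install.
UCX_INCLUDE_DIR="${UCX_INCLUDE_DIR:-/usr/include/uct}"

if headers_declare "$UCX_INCLUDE_DIR" UCT_MD_MEM_ACCESS_REMOTE_GET; then
    echo "symbol found: UCX headers declare what btl/uct needs"
else
    echo "symbol missing: UCX headers not found here or too old for btl/uct"
fi
```

If the symbol is missing, downgrading as above is one way out. Other options that may work, which I have not tested on this setup, are installing a newer UCX or keeping 4.0.1 and skipping the failing component by adding --enable-mca-no-build=btl-uct to the configure line.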