I recently installed OpenMPI version 2.0 on my SGE cluster, but when I submit a job I get "Host key verification failed", even though I can log in to that node (compute10) from the submit host without a password.
The error in the output file:
Warning: no access to tty (Bad file descriptor).
Thus no job control in this shell.
Wed Jan 30 15:58:53 EST 2019
Host key verification failed.
[file orca_main/gtoint.cpp, line 137]: ORCA finished by error termination in ORCA_GTOInt
My SGE script is below:
#!/bin/tcsh
#$ -q sge-queue@compute10
#$ -pe mpi 8
#$ -V
#$ -cwd
#$ -j y
#$ -l h_vmem=64G
date
setenv OMP_NUM_THREADS 8
/home/user/orca_4_0_1_2_linux_x86-64_openmpi202/orca ccl3.inp > ccl3.out
date
And my parallel environment, mpi:
pe_name mpi
slots 999
user_lists NONE
xuser_lists NONE
start_proc_args /export/sge6.2_U7/mpi/startmpi.sh -catch_rsh $pe_hostfile
stop_proc_args /export/sge6.2_U7/mpi/stopmpi.sh
allocation_rule $pe_slots
control_slaves TRUE
job_is_first_task FALSE
urgency_slots min
accounting_summary TRUE
After trying various things, I solved the issue by upgrading OpenMPI to version 3.1.0 and building it with the options below.
./configure --prefix=/usr/local --with-sge --enable-orterun-prefix-by-default
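For anyone repeating this, a quick way to confirm that the rebuilt OpenMPI actually picked up SGE support is to look for the gridengine component in the ompi_info output. With that support compiled in, OpenMPI should start its remote processes through SGE itself rather than through plain ssh, which would explain why the host key check stops being a problem. The snippet below is only a sketch in POSIX sh (not tcsh): the helper name has_gridengine is made up for illustration, and it assumes the ompi_info from the new install is the first one on PATH.

```shell
#!/bin/sh
# Hypothetical helper (name chosen for illustration): reads ompi_info
# output on stdin and succeeds if a gridengine component is listed.
has_gridengine() {
    grep -q 'gridengine'
}

# Assumption: the ompi_info found on PATH belongs to the new OpenMPI build.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_gridengine; then
        echo "gridengine support found"
    else
        echo "gridengine support NOT found; was --with-sge passed to configure?"
    fi
else
    echo "ompi_info not found on PATH"
fi
```

If the component is missing, the build most likely did not see --with-sge, or an older OpenMPI install is still ahead of the new one on PATH.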