I have a problem debugging an MPI application with lldb. Essentially, I attach it to every process via
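# Usage: mpirun_lldb <nprocs> <executable> [program args...]
mpirun_lldb() {
    # One xterm running lldb per rank; arguments from the third onward go to the program.
    mpirun --mca orte_base_help_aggregate 0 --mca mpi_abort_print_stack 1 -np "$1" xterm -hold -e lldb -f "$2" -- "${@:3}"
}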
but at some point MPI_Abort is called and all xterm windows close immediately, so I can't even read the stack, let alone debug and inspect variables:
MPI_ABORT was invoked on rank 2 in communicator MPI_COMM_WORLD
with errorcode 255.
NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
I tried using -hold for xterm, but this does not help.
P.S. I don't have access to licensed debuggers like TotalView. This is the first time I have had a problem with the simple method described above.
Sorry for the noise: adding a breakpoint with b MPI_Abort solved the issue.
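
For anyone who does not want to type the breakpoint into every window by hand, here is a minimal sketch of how it could be set at launch. It assumes lldb's -o option (run a command after the target is loaded) and that mpirun and xterm pass the quoted command through unchanged. The name mpirun_lldb_bp is made up for this example, and I have not tested it on every Open MPI version.

```shell
# Hypothetical variant of mpirun_lldb (the name is an assumption, not from the original post).
# Usage: mpirun_lldb_bp <nprocs> <executable> [program args...]
mpirun_lldb_bp() {
    nprocs=$1
    exe=$2
    shift 2   # whatever remains in "$@" is passed to the program itself

    # Each -o command is queued by lldb and executed once the target is loaded:
    # first the breakpoint on MPI_Abort, then "run" to start the process.
    mpirun --mca orte_base_help_aggregate 0 --mca mpi_abort_print_stack 1 \
        -np "$nprocs" xterm -hold -e lldb \
        -o "breakpoint set --name MPI_Abort" \
        -o "run" \
        -f "$exe" -- "$@"
}
```

Called as mpirun_lldb_bp 4 ./app arg1, the rank that reaches MPI_Abort should stop at the breakpoint before Open MPI kills the other processes, so bt and frame variable can be used in that window. Drop the -o "run" line if you would rather start each rank manually.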