I'm writing a Java program to run on an HPC cluster using MPI, but I have the feeling I'm wasting resources because of the way Open MPI works. I'm actually not at all sure how it works, but I assume that mpirun -n 10 java myProgram.java
starts myProgram 10 times on different nodes/cores/etc. (depending on the binding). As far as my understanding of the JVM reaches, this also means 10 JVMs are running.
After some test runs with my program using the default settings (binding to cores and packing everything as tightly as possible onto one node), I realised memory usage was very poor and concluded I would have to do something about the multiplicity of JVMs. I tried loosening the binding from cores to nodes, but then I don't use all the available processing power. I also tried multithreading to work around that, but I read somewhere that's not one of the better ideas either (though I'm still trying to find a way around it).
So my question is:
Does there exist a way to link every node with only one JVM, using Open MPI?
Alternatively: how can I make better use of the memory?
Thanks in advance
As Hristo Iliev already mentioned in the comments, the hybrid approach solves my problem. To avoid having to use MPI_THREAD_MULTIPLE, I distributed the work statically across the threads. The program can then be run with

mpirun --map-by ppr:1:node --bind-to board java <java-program>

which limits the number of processes to one per node.
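For reference, a minimal, self-contained sketch of that static distribution. The rank/size values and the work loop are placeholders so the sketch runs stand-alone; with the Open MPI Java bindings they would come from MPI.COMM_WORLD.getRank() and MPI.COMM_WORLD.getSize() instead:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class HybridSketch {
    // Static block distribution: part `index` of `parts` gets a
    // contiguous [start, end) chunk of `totalWork` items.
    static int[] partition(int totalWork, int parts, int index) {
        int base = totalWork / parts, rem = totalWork % parts;
        int start = index * base + Math.min(index, rem);
        int end = start + base + (index < rem ? 1 : 0);
        return new int[] { start, end };
    }

    public static void main(String[] args) throws Exception {
        // Stand-ins for the MPI rank and communicator size; with the
        // Open MPI Java bindings these would be MPI.COMM_WORLD.getRank()
        // and MPI.COMM_WORLD.getSize() after MPI.Init(args).
        int rank = 0, size = 1;
        int threads = Runtime.getRuntime().availableProcessors();
        int totalWork = 1000;

        // First split the work statically across the MPI processes...
        int[] mine = partition(totalWork, size, rank);

        // ...then across this process's threads. No MPI calls happen
        // inside the worker threads, so MPI_THREAD_MULTIPLE is never
        // required.
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<Long>> results = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            int[] span = partition(mine[1] - mine[0], threads, t);
            final int lo = mine[0] + span[0], hi = mine[0] + span[1];
            results.add(pool.submit(() -> {
                long sum = 0;
                for (int i = lo; i < hi; i++) sum += i; // placeholder work
                return sum;
            }));
        }
        long total = 0;
        for (Future<Long> f : results) total += f.get();
        pool.shutdown();
        System.out.println(total); // sum of 0..999 = 499500
    }
}
```

Because each chunk is fixed up front, the threads never need to communicate, which is exactly what makes the single-threaded MPI level (MPI_THREAD_FUNNELED or even MPI_THREAD_SINGLE) sufficient.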