I use an MPI (mpi4py) script on a single node that works with a very large object. To give all processes access to the object, I distribute it through comm.bcast(). This copies the object to every process and consumes a lot of memory, especially during the copy itself. Therefore, I would like to share something like a pointer instead of the object itself. I found some features of memoryview useful for speeding up work with the object inside a process. The object's real memory address is also accessible through the string representation of the memoryview object and can be distributed like this:
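from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()

if rank:
    # Non-root ranks receive the address string broadcast by rank 0.
    content_pointer = comm.bcast(None, root=0)
    print(rank, content_pointer)
else:
    # Rank 0 builds the large object (100,000,000 bytes of b'a').
    content = ''.join(['a' for i in range(100000000)]).encode()
    mv = memoryview(content)
    print(mv)
    # Extract the hex address from '<memory at 0x...>' and broadcast it.
    comm.bcast(str(mv).split()[-1][:-1], root=0)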
This prints:
<memory at 0x7f362a405048>
1 0x7f362a405048
2 0x7f362a405048
...
That's why I believe that there must be a way to reconstitute the object in another process. However, I cannot find a clue in the documentation about how to do it.
In short, my question is: Is it possible to share an object between processes on the same node in mpi4py?
I don't really know much about mpi4py, but this should not be possible from the MPI point of view. MPI stands for Message Passing Interface, which means exactly that: passing messages between processes. The address you broadcast is only meaningful inside the process that owns it, because each process has its own virtual address space. You could try to use MPI one-sided communication to emulate something like globally accessible memory, but otherwise the memory of one process is unavailable to other processes.
If you need to rely on a large block of shared memory, you need to use something like OpenMP or threads, which you absolutely could use on a single node. A hybrid parallelization with MPI plus some shared-memory parallelization would give you one shared memory block per node while keeping the option to use many nodes.
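To illustrate the threads option with a minimal sketch: all threads of one process see the same object, so nothing has to be broadcast or copied. This uses only the standard library. The names worker, results and n_workers are made up for the example, and the object is smaller than in the question so it runs quickly. Keep in mind that CPython threads do not execute Python bytecode in parallel, so this addresses the memory problem rather than compute speed.

```python
import threading

# One large, read-only object living in the memory of a single process.
content = b"a" * 10_000_000
view = memoryview(content)

results = {}

def worker(worker_id, start, stop):
    # Slicing a memoryview references the same bytes; no copy is made.
    chunk = view[start:stop]
    results[worker_id] = chunk.nbytes

n_workers = 4
step = len(content) // n_workers
threads = [
    threading.Thread(target=worker, args=(i, i * step, (i + 1) * step))
    for i in range(n_workers)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Total number of bytes the workers looked at, without duplicating content.
print(sum(results.values()))
```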