Tags: memory, memory-management, cuda, gpu, shared-memory

Can two processes share the same GPU memory? (CUDA)


In the CPU world one can do this via memory mapping. Can something similar be done for the GPU?

If two processes could share the same CUDA context, I think it would be trivial: just pass the GPU memory pointer around. Is it possible to share the same CUDA context between two processes?

Another possibility I can think of is to map device memory into memory-mapped host memory. Since it is memory-mapped, it could be shared between two processes. Does this make sense, is it possible, and are there any overheads?


Solution

  • CUDA MPS effectively allows CUDA activity emanating from two or more processes to behave as if it came from a single context on the GPU. (To be clear, CUDA MPS does not cause two or more processes to actually share the same context. The work-scheduling behavior merely appears similar to what you would observe if all the work were issued from one process, and therefore one context.) However, this won't provide what you are asking for:

    can two processes share the same GPU memory?

    One method to achieve this is the CUDA IPC (interprocess communication) API.

    This will allow you to share an allocated device memory region (i.e. a memory region allocated via cudaMalloc) between multiple processes. This answer contains additional resources to learn about CUDA IPC.

    However, according to my testing, this does not enable sharing of pinned host memory regions (e.g. a region allocated via cudaHostAlloc) between multiple processes. The memory region itself can be shared using the ordinary IPC mechanisms available on your particular OS, but it cannot be made to appear as "pinned" memory in two or more processes.
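    To make the CUDA IPC flow for device memory concrete, here is a sketch (error checking omitted; how you transport the opaque handle between processes is up to you, and is not part of the CUDA API):

    ```cuda
    // Process A (exporter): allocate device memory and export an IPC handle
    float *d_buf;
    cudaMalloc(&d_buf, N * sizeof(float));

    cudaIpcMemHandle_t handle;
    cudaIpcGetMemHandle(&handle, d_buf);
    // ship the opaque handle bytes to process B via any ordinary
    // host-side IPC mechanism (pipe, socket, file, POSIX shm, ...)

    // Process B (importer): map A's allocation into this process
    cudaIpcMemHandle_t handle;           // received from process A
    float *d_remote = nullptr;
    cudaIpcOpenMemHandle((void **)&d_remote, handle,
                         cudaIpcMemLazyEnablePeerAccess);
    // d_remote now refers to the same device memory as d_buf in A;
    // launch kernels on it, cudaMemcpy from/to it, etc.
    cudaIpcCloseMemHandle(d_remote);     // unmap when done
    ```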