Search code examples
pythonc++ipcshared-memory

Python Shared Memory using mmap and empty files


I'm trying to make a fast library for interprocess communication between any combination of Python and C/C++ processes. (i.e. Python <-> Python, Python <-> C++, or C++ <-> Python)

In the hopes of having the fastest implementation, I'm trying to utilize shared memory using mmap. The plan is for two processes to share memory by "mmap-ing" the same file and read from and write to this shared memory to communicate.

I want to avoid any actual writes to a real file, and instead simply want to use a filename as a handle for the two processes to connect. However, I get hung up on the following call to mmap:

self.memory = mmap.mmap(fileno, self.maxlen)

where I get the following error:

FileNotFoundError: [Errno 2] No such file or directory: 'shared_memory_file'

or if I make an empty file:

ValueError: mmap length is greater than file size

Do I need to simply make an empty file filled with nulls in order to be able to use shared memory like this?

How can I use mmap for shared memory in Python between unrelated processes (not parent<->child communication) in a way which C++ can also play along? (not using multiprocessing.shared_memory)


Solution

  • To answer the questions directly as best I can:

    • The file needs to be sized appropriately before it can be mapped. If you need more space, there are different ways to do it ... but most portable is likely unmap the file, resize the file on disk, and then remap the file. See: How to portably extend a file accessed using mmap()

    • You might be able to mmap with MAP_ANONYMOUS|MAP_SHARED, then fork, then run with the same shared memory in both processes. See: Sharing memory between processes through the use of mmap()

    • Alternatively, you could create a ramdisk, create a file there of a specific size, and then mmap into both processes.

    • Keep in mind that you'll need to deal with synchronization between the two processes - different platforms might have different approaches to this, but they traditionally involve using a semaphore of some kind (e.g. on Linux: https://man7.org/linux/man-pages/man7/sem_overview.7.html).

    All that being said, traditional shared memory will probably do better than mmap for this use-case. In general, OS-level IPC mechanisms are likely to do better out of the box than hand-rolled solutions - there's a lot of tuning that goes into something to make it perform well, and mmap isn't always an automatic win.

    Good luck with the project!