Tags: python, multiprocessing, shared-memory

Python `ShareableList` is removed when "reading" process closes


I have a "main" process (write-only) in Python that generates a list of IDs which I would like to share with other independently created Python processes (read-only) on the same machine. I want the list to persist regardless if any of the "read" processes exit. I have been exploring ShareableList and SharedMemory from multiprocessing to see if it's suitable for this use case, but encountered some behavior I did not expect. The following is a script I wrote to test this out.

shareable_list.py

import argparse
from multiprocessing import shared_memory


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument("--name", type=str, default="shared-memory-test",
                        help="name of shared memory block")
    parser.add_argument("--process-type", type=str, default="read", help="If 'write', then "
                        "write to shared memory. If 'read', then read from shared memory.")
    return parser.parse_args()


if __name__ == "__main__":
    args = parse_args()
    max_seq_len = 64

    if args.process_type == "write":
        # initialize shared memory (preallocate in case I need to add more;
        # each slot's storage size is fixed by its initial value at creation)
        share = shared_memory.ShareableList(sequence=["" for _ in range(max_seq_len)],
                                            name=args.name)
        print(f"created shared memory block {args.name} with sequence length {max_seq_len}")
        for i in range(5):
            data = str(i)
            print(f"writing {data} to shared memory")
            share[i] = data
    elif args.process_type == "read":
        # attach to the existing shared memory block and read from it
        share = shared_memory.ShareableList(name=args.name)
        for i, data in enumerate(share):
            if data:
                print(f"read {data} from shared memory index {i}")
    else:
        raise ValueError(f"invalid process_type: {args.process_type}")

    # stall until user quits
    input("Press enter to quit:")

    # close shared memory
    share.shm.close()
    if args.process_type == "write":
        share.shm.unlink()
        print(f"unlinked shared memory")

Here is how I tested it:

  1. Run `python shareable_list.py --process-type write` to create and fill a ShareableList object. Leave this process running.
  2. Open a new shell and run `python shareable_list.py --process-type read`.
  3. Open a third shell and run `python shareable_list.py --process-type read`.

The first process outputs the following (which is expected):

created shared memory block shared-memory-test with sequence length 64
writing 0 to shared memory
writing 1 to shared memory
writing 2 to shared memory
writing 3 to shared memory
writing 4 to shared memory

The second and third processes output this (also expected):

read 0 from shared memory index 0
read 1 from shared memory index 1
read 2 from shared memory index 2
read 3 from shared memory index 3
read 4 from shared memory index 4

However, when I close the second or third process by pressing "enter" I receive the following warning:

UserWarning: resource_tracker: There appear to be 1 leaked shared_memory objects to clean up at shutdown

It also seems to remove the shared memory block: after closing a "read" process, opening any new "read" process or closing the "write" process results in the following error:

FileNotFoundError: [Errno 2] No such file or directory: '/shared-memory-test'

After reading the docs on close() and unlink(), I assumed I should call close() before each "read" process ends, and close() followed by unlink() before the "write" process ends. My best guess is that each "read" process thinks it is the only process tracking the object and therefore cleans it up on exit. Is my understanding incorrect here? Is this even a good approach to solving my problem? Thanks.
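
One way to confirm that the block really is being unlinked (rather than just closed) is to check its backing file: on Linux, POSIX shared memory segments appear under /dev/shm. The snippet below assumes a Linux host and the default block name from the script; the path is not part of the multiprocessing API.

import os

# backing file for the segment on Linux; it disappears the moment
# any process (or its resource tracker) unlinks the block
print(os.path.exists("/dev/shm/shared-memory-test"))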


Solution

  • It appears that this is a bug in the Python standard library.

    See this answer

    The bug is reported here

    It looks like it will be fixed (or worked around) in Python 3.13, which is planned for release in October 2024, via a new track parameter on SharedMemory: a "reading" process can attach with track=False so that its resource tracker never registers, and therefore never unlinks, the segment. Until then, the segment can be unregistered from the reader's resource tracker by hand, as sketched below.

    Although it's easy to reproduce this issue, I have also found that it does not always occur.
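
For reference, here is a minimal sketch of a commonly cited workaround, applied to the "read" side of the script above. It unregisters the attached segment from the reading process's resource tracker so the tracker does not unlink it at exit. It relies on the private multiprocessing.resource_tracker module and the internal shm._name attribute, so treat it as a stopgap rather than a supported API:

from multiprocessing import resource_tracker, shared_memory

# attach to the existing block; as a side effect, this process's
# resource tracker registers the segment as if this process owned it
share = shared_memory.ShareableList(name="shared-memory-test")

# tell the tracker to forget the segment so it is not unlinked when
# this reader exits (private API; _name is the internal segment name,
# which includes the leading slash on POSIX systems)
resource_tracker.unregister(share.shm._name, "shared_memory")

for i, data in enumerate(share):
    if data:
        print(f"read {data} from shared memory index {i}")

# close this process's mapping; the segment itself survives
share.shm.close()

On Python 3.13+, attaching with shared_memory.SharedMemory(name=..., track=False) should achieve the same effect without private APIs; note that the new parameter is documented on SharedMemory, and ShareableList itself does not appear to expose it.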