python · multiprocessing · shared-memory

multiprocessing: Can a dict be shared between two Python shells?


I come from the post multiprocessing: How do I share a dict among multiple processes?, but I want something slightly different. In that post, a dict is shared between a parent process and its child, which is instantiated via the Process constructor. What I want is to share a dict between two Python shells.


Solution

  • What you would like to do is use a managed dictionary. The problem is that each process will acquire its own instance of a sharable dictionary if it uses code such as:

    from multiprocessing import Manager
    
    with Manager() as manager:
        sharable_dict = manager.dict()
        ... # etc.
    
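    A minimal sketch of the problem (standard library only): each Manager() call starts its own server process, so dictionaries obtained from two different managers are unrelated:

    ```python
    from multiprocessing import Manager

    if __name__ == '__main__':
        # Two independently started managers each serve their own dict
        m1, m2 = Manager(), Manager()
        d1, d2 = m1.dict(), m2.dict()

        d1['x'] = 1
        in_d1, in_d2 = 'x' in d1, 'x' in d2
        print(in_d1, in_d2)  # True False: the two dictionaries are unrelated
        m1.shutdown()
        m2.shutdown()
    ```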

    Instead, we register a managed type, sharable_dict, that always returns a proxy to the same singleton managed dictionary. For this to work, all processes must connect to a common BaseManager server:

    File test2.py (your actual processing code)

    from multiprocessing.managers import BaseManager
    from multiprocessing import current_process
    
    address = "127.0.0.1"
    port = 50000
    password = "secret"
    
    def connect_to_manager():
        BaseManager.register('sharable_dict')
        manager = BaseManager(address=(address, port), authkey=password.encode('utf-8'))
        manager.connect()
        return manager.sharable_dict()
    
    if __name__ == '__main__':
        sharable_dict = connect_to_manager()
        pid = current_process().pid
        print('My pid =', pid)
        sharable_dict[pid] = True
    

    The above code gets a proxy to the common, sharable dictionary and, for demo purposes, just adds a key that is the current process id.

    File test.py

    This file creates the managed, sharable dictionary that can be served up to any process wanting to use it, and then uses subprocess.Popen to run multiple external processes (i.e. instances of test2.py). Finally, this code prints out the sharable dictionary to show that all 3 external processes have added their entries:

    from multiprocessing.managers import BaseManager, DictProxy
    from threading import Thread, Event
    from test2 import address, port, password, connect_to_manager
    from subprocess import Popen
    
    the_dict = None
    
    def get_dict():
        global the_dict
    
        if the_dict is None:
            the_dict = {}
        return the_dict
    
    def server(started_event, shutdown_event):
        net_manager = BaseManager(address=(address, port), authkey=password.encode('utf-8'))
        BaseManager.register('sharable_dict', get_dict, DictProxy)
        net_manager.start()
        started_event.set() # tell main thread that we have started
        shutdown_event.wait() # wait to be told to shutdown
        net_manager.shutdown()
    
    def main():
        started_event = Event()
        shutdown_event = Event()
        server_thread = Thread(target=server, args=(started_event, shutdown_event,))
        server_thread.start()
        # wait for manager to start:
        started_event.wait()
    
        processes = [Popen(['python', 'test2.py']) for _ in range(3)]
        for process in processes:
            process.communicate()
    
        sharable_dict = connect_to_manager()
        print('sharable dictionary =', sharable_dict)
    
        # tell manager we are through:
        shutdown_event.set()
        server_thread.join()
    
    if __name__ == '__main__':
        main()
    

    Prints:

    My pid = 18736
    My pid = 12476
    My pid = 10584
    sharable dictionary = {18736: True, 12476: True, 10584: True}
    
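    Since any process that knows the address and authkey can connect, the same pattern also answers the original question about two interactive shells: start the server, then connect from each shell. Below is a self-contained sketch of that idea (the names ServerManager, ClientManager, get_store and the port 50001 are invented for illustration), with two client connections standing in for two shells:

    ```python
    from multiprocessing.managers import BaseManager, DictProxy

    _store = {}

    def get_store():
        # Always return the same dict, so every proxy refers to one object
        return _store

    class ServerManager(BaseManager):
        pass

    class ClientManager(BaseManager):
        pass

    ServerManager.register('shared', callable=get_store, proxytype=DictProxy)
    ClientManager.register('shared')

    if __name__ == '__main__':
        server = ServerManager(address=('127.0.0.1', 50001), authkey=b'secret')
        server.start()

        # Two independent connections stand in for two interactive shells
        c1 = ClientManager(address=('127.0.0.1', 50001), authkey=b'secret')
        c1.connect()
        c2 = ClientManager(address=('127.0.0.1', 50001), authkey=b'secret')
        c2.connect()

        d1, d2 = c1.shared(), c2.shared()
        d1['x'] = 42
        value_seen = d2['x']  # the write through c1 is visible through c2
        print(value_seen)

        server.shutdown()
    ```

    Each client holds only a proxy; reads and writes all go through the single server process, which is why every connection sees the same data.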

    Update

    This sharable dictionary, of course, only works for processes executing Python scripts. Assuming you are able to modify those scripts to connect to the server and get the sharable dictionary, then perhaps you can instead place the code to be executed in a "worker" function and create the processes as multiprocessing.Process instances. This results in far simpler code.

    Your worker function:

    File test2.py

    from multiprocessing import current_process
    
    def worker(sharable_dict):
        pid = current_process().pid
        print('My pid =', pid)
        sharable_dict[pid] = True
    

    And the code to create a sharable dictionary and create child processes that use it:

    File test.py

    from multiprocessing import Manager, Process
    from test2 import worker
    
    def main():
        with Manager() as manager:
            sharable_dict = manager.dict()
            processes = [Process(target=worker, args=(sharable_dict,)) for _ in range(3)]
            for process in processes:
                process.start()
            for process in processes:
                process.join()
    
            print('sharable dictionary =', sharable_dict)
    
    if __name__ == '__main__':
        main()