Tags: python, python-3.x, multiprocessing, global-variables, python-3.6

Is it possible to have multiple processes write to the same dictionary using Pool.map()?


I'm using Python 3.6 and am trying to execute something similar to the following code. However, after execution the variable mydict remains {}.

I've tried this with global mydict and without. I thought that dicts were global by default, but neither seems to work.

from contextlib import closing
from multiprocessing import Pool

mydict = {}
def TEST(hello, integer):
    global mydict
    mydict[integer] = hello
    print(integer)
with closing(Pool(processes=4)) as pool:
    pool.starmap(TEST, [['Hello World', i] for i in range(200)])

Is it possible to have multiple processes write to the same dictionary in python?


Solution

  • Is it possible to have multiple processes write to the same dictionary in python?

    No, an ordinary dictionary cannot be shared directly between processes. Each process runs in its own memory space, with its own copy of the interpreter, so every worker ends up modifying its own private copy of mydict and the parent's copy stays empty. Direct sharing via shared memory is possible for some other data types, such as multiprocessing.Value and multiprocessing.Array.
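To make the point about separate memory spaces concrete, here is a minimal sketch (the names record and run_demo are made up for illustration and are not part of the question's code). Each worker reports the size of its own copy of the module-level dictionary, while the parent's copy is never touched.

```python
from multiprocessing import Pool

mydict = {}  # every worker process ends up with its own independent copy


def record(integer):
    # This mutates the copy that lives in the worker process,
    # not the dictionary owned by the parent process.
    mydict[integer] = 'Hello World'
    return len(mydict)  # size of the worker-local copy after this call


def run_demo():
    with Pool(processes=4) as pool:
        worker_sizes = pool.map(record, range(200))
    # The parent's dictionary was never modified by the workers.
    return len(mydict), max(worker_sizes)


if __name__ == '__main__':
    parent_size, largest_worker_copy = run_demo()
    print('entries in parent dict:', parent_size)
    print('largest worker-local dict:', largest_worker_copy)
```

The parent reports zero entries even though every call in a worker succeeded, which matches the behavior described in the question.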

    However, it can be simulated by using a multiprocessing.Manager() to coordinate updates to certain kinds of shared objects—and one of the supported types is dict.

    This is discussed in the Sharing state between processes section of the online documentation. Using a Manager involves a lot of overhead because it runs as a separate server process in parallel with any other processes your code creates, and every access to a shared object has to be forwarded to that server.

    Anyway, here's a working example based on the code in your question that uses one to manage concurrent updates to a shared dictionary. Since what the TEST() function does is so trivial, this version may well be slower than not using multiprocessing at all, because of the extra overhead involved. Something like this would likely be more appropriate for much more computationally intensive tasks.

    from contextlib import closing
    from multiprocessing import Pool, Manager
    
    def TEST(mydict, hello, integer):
        mydict[integer] = hello
        print(integer)
    
    if __name__ == '__main__':
    
        with Manager() as manager:
            my_dict = manager.dict()
    
            with closing(Pool(processes=4)) as pool:
                pool.starmap(TEST, ((my_dict, 'Hello World', i) for i in range(200)))
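The example above fills the managed dictionary but never reads it back. One possible way to get the results out, sketched here with the hypothetical helper names store and build_dict, is to copy the proxy into a plain dict while the manager is still running, since the proxy is no longer usable once the Manager's with block exits.

```python
from multiprocessing import Pool, Manager


def store(shared, hello, integer):
    # 'shared' is a proxy object; the assignment is forwarded to the
    # manager's server process, which holds the real dictionary.
    shared[integer] = hello


def build_dict(count=200):
    with Manager() as manager:
        shared = manager.dict()
        with Pool(processes=4) as pool:
            pool.starmap(store, [(shared, 'Hello World', i) for i in range(count)])
        # Copy into an ordinary dict before the manager shuts down.
        return dict(shared)


if __name__ == '__main__':
    result = build_dict()
    print(len(result))
    print(result[0])
```

The returned object is a regular dictionary owned by the parent process, so it can be used normally after the manager and the pool have both been shut down.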