Sharing mutable global variable in Python multiprocessing.Pool

I'm trying to update a shared object (a dict) using the following code. But it does not work. It gives me the input dict as an output.

Edit: Essentially, what I'm trying to achieve here is to append items in the data (a list) to the dict's lists. Data items give indices in the dict.

Expected output: {'2': [2], '1': [1, 4, 6], '3': [3, 5]}
Note: Approach 2 raises TypeError: 'int' object is not iterable

  1. Approach 1

    from multiprocessing import *
    def mapTo(d,tree):
            for idx, item in enumerate(list(d), start=1):
                tree[str(item)].append(idx)

    data=[1,2,3,1,3,1]
    manager = Manager()
    sharedtree= manager.dict({"1":[],"2":[],"3":[]})
    with Pool(processes=3) as pool:
        pool.starmap(mapTo, [(data,sharedtree ) for _ in range(3)])

  2. Approach 2

    from multiprocessing import *
    def mapTo(d):
            global tree
            for idx, item in enumerate(list(d), start=1):
                tree[str(item)].append(idx)

    def initializer():
         global tree
         tree = dict({"1":[],"2":[],"3":[]})

    data=[1,2,3,1,3,1]
    with Pool(processes=3, initializer=initializer, initargs=()) as pool:
        pool.map(mapTo,data)


  • You need to use managed lists if you want the changes to be reflected. So, the following works for me:

    from multiprocessing import Manager, Pool

    def mapTo(d, tree):
        for idx, item in enumerate(list(d), start=1):
            tree[str(item)].append(idx)  # append goes through the managed list

    if __name__ == '__main__':
        data = [1, 2, 3, 1, 3, 1]
        with Pool(processes=3) as pool:
            manager = Manager()
            sharedtree = manager.dict({"1": manager.list(), "2": manager.list(), "3": manager.list()})
            pool.starmap(mapTo, [(data, sharedtree) for _ in range(3)])
        print({k: list(v) for k, v in sharedtree.items()})

    This is the output:

    {'1': [1, 1, 1, 4, 4, 4, 6, 6, 6], '2': [2, 2, 2], '3': [3, 3, 5, 3, 5, 5]}

    Note: you should always use the if __name__ == '__main__': guard when using multiprocessing. Also, avoid starred imports.


    If you are on Python < 3.6, you have to re-assign the list to the dict after appending to it, so use this for mapTo:

    def mapTo(d,tree):
            for idx, item in enumerate(list(d), start=1):
                l = tree[str(item)]
                l.append(idx)
                tree[str(item)] = l
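
Why the re-assignment matters can be seen without spawning any processes. The sketch below is only a single-process model: `server_side`, `proxy_get`, and `proxy_set` are hypothetical stand-ins, and a pickle round trip imitates the copy that a dict proxy hands back when a plain (non-managed) list is looked up.

```python
import pickle

# Hypothetical stand-in for the dict that lives in the manager process.
server_side = {"1": []}

def proxy_get(key):
    """Model a proxy lookup: the value arrives as a pickled copy."""
    return pickle.loads(pickle.dumps(server_side[key]))

def proxy_set(key, value):
    """Model a proxy assignment: the value is sent over and stored."""
    server_side[key] = pickle.loads(pickle.dumps(value))

# In-place append mutates only the local copy, so the update is lost.
proxy_get("1").append(1)
lost = list(server_side["1"])
print(lost)  # []

# Read, modify, write back: the assignment is what reaches the owner.
l = proxy_get("1")
l.append(1)
proxy_set("1", l)
kept = list(server_side["1"])
print(kept)  # [1]
```

This is also why Approach 1 returns the input dict unchanged: each `tree[str(item)].append(idx)` there appends to a throwaway copy of a plain list.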

    And finally, you aren't using starmap/map correctly: you are passing the whole data list three times, so of course everything gets counted three times. A mapping operation should work on each individual element of the data you are mapping over, so you want something like:

    from functools import partial
    from multiprocessing import Manager, Pool

    def mapTo(i_d, tree):
        idx, item = i_d  # one (index, item) pair per call
        l = tree[str(item)]
        l.append(idx)
        tree[str(item)] = l

    if __name__ == '__main__':
        data = [1, 2, 3, 1, 3, 1]
        with Pool(processes=3) as pool:
            manager = Manager()
            sharedtree = manager.dict({"1": manager.list(), "2": manager.list(), "3": manager.list()})
            pool.map(partial(mapTo, tree=sharedtree), list(enumerate(data, start=1)))
        print({k: list(v) for k, v in sharedtree.items()})
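
To see what mapping over individual elements means, the same call shape can be traced serially with the built-in `map`. This is a single-process sketch, not the pool version: a plain dict stands in for the managed one, `map_to` is a renamed copy of the worker, and the `data` list is assumed from the expected output in the question.

```python
from functools import partial

def map_to(i_d, tree):
    """Handle exactly one (index, item) pair."""
    idx, item = i_d
    tree[str(item)].append(idx)

data = [1, 2, 3, 1, 3, 1]  # assumed input, inferred from the expected output
tree = {"1": [], "2": [], "3": []}

# partial() pins the tree argument, so map() only supplies the pairs.
# The built-in map is lazy; list() forces the calls to run.
list(map(partial(map_to, tree=tree), enumerate(data, start=1)))
print(tree)  # {'1': [1, 4, 6], '2': [2], '3': [3, 5]}

# Approach 2 fails because map hands the worker a single int,
# and the worker then calls list() on it.
try:
    list(5)
except TypeError as exc:
    message = str(exc)
print(message)  # 'int' object is not iterable
```

Each pair is processed once, so nothing is counted three times. In the pool version the pairs are spread across workers, so the order inside each list is not guaranteed.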