Tags: python, python-3.x, dictionary, multiprocessing, python-multiprocessing

Modifying outer dict inside a multiprocessing pool


I'm trying to modify a dictionary (file) with a multiprocessing pool. However, I can't make it happen.

Here is what I'm trying:

import json
import multiprocessing



def teste1(_dict, _iterable):
    file1[f'{_iterable}'] = {'relevant': True}


file1 = {'item1': {'relevant': False}, 'item2': {'relevant': False}}

pool = multiprocessing.Pool(4)
manager = multiprocessing.Manager()
dicto = manager.dict()
pool.apply_async(teste1, (file1, file1))
print(file1)

However, it's still printing out the same as before: {'item1': {'relevant': False}, 'item2': {'relevant': False}}

Could one noble soul help me out with this?


Solution

  • There are multiple issues with your approach:

    1. You are attempting to share a dictionary (file1) across several processes, but each process actually works on its own copy, so changes made in a worker never reach the parent. To solve this, please refer to: multiprocessing: How do I share a dict among multiple processes?

    2. You pass the whole dictionary as the second argument instead of iterating over its keys, so the function tries to index the dictionary with the dictionary itself!
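Both issues can be reproduced without starting any process. The sketch below is only an imitation: it assumes that a pickle round trip is a fair stand-in for how arguments reach a pool worker, and the name `worker_copy` is made up for illustration.

```python
import pickle

file1 = {'item1': {'relevant': False}, 'item2': {'relevant': False}}

# Issue 1: arguments reach a worker by being pickled and unpickled,
# so the worker operates on a copy (imitated here with a round trip).
worker_copy = pickle.loads(pickle.dumps(file1))
worker_copy['item1'] = {'relevant': True}
print(file1['item1'])  # {'relevant': False}, the original is untouched

# Issue 2: formatting the whole dict gives its text representation,
# which is not one of its keys.
bad_key = f'{file1}'
print(bad_key in file1)       # False
print(bad_key == str(file1))  # True
```

In the question's code the worker also writes to the global file1 rather than to its `_dict` argument, but that global is likewise a per-process copy, so the parent sees no change either way.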

    Assuming that what you need is a dictionary with values updated by parallel processes, you have two choices:

    A. Share the dictionary across processes and iterate over keys like:

    for key in file1.keys():  # assuming file1 is properly shared
        pool.apply_async(teste1, (file1, key))
    

    B. A simpler approach: build the resulting dictionary from the values returned by the teste1 function running in parallel:

    import multiprocessing


    def teste1(dict_key):
        # some logic dependent on dict_key
        return {'relevant': True}


    if __name__ == '__main__':  # guard needed where workers re-import this module
        file1 = {'item1': {'relevant': False}, 'item2': {'relevant': False}}

        pool = multiprocessing.Pool(4)
        results = pool.map(teste1, file1.keys())  # blocks until all results are in
        pool.close()
        pool.join()

        # pool.map returns results in input order, so keys and results line up
        file2 = dict(zip(file1.keys(), results))
        print(file2)