Tags: python, multiprocessing, shared-memory, python-zip

Multiprocess or multiprocessing not working [EDIT] list(zip(a,b,)) behaviour


The following code:

import multiprocessing as mp
from multiprocessing.managers import SyncManager

n_cores = mp.cpu_count()

def parallel_fn(job_n, cache):
    cache['job_b'] = job_n
    return job_n

if __name__=="__main__":
    with SyncManager() as manager:

        shared_cache = manager.dict()
        
        args = list(zip(range(n_cores), shared_cache))

        with mp.Pool(n_cores) as pool:

            result = pool.starmap(parallel_fn, args)

            print(result)
        
        print(shared_cache)

returns

16
Shared dict before: {}
Pool return: []
Shared dict after: {}

I was expecting 16 values returned from the pool and 16 values in the shared dictionary, but both are empty. Can anyone help me out?


Solution

  • Multiprocessing is a red herring in this case. If you print args right after defining it, you'll see an empty list. zip stops as soon as its shortest input is exhausted. Here you have a range object of length 16 and a DictProxy object of length 0 (it's empty to start), so zip yields nothing and starmap has no jobs to run. As a small example, list(zip([1, 2], dict())) returns a list of length 0. To fix it, build the list of argument tuples explicitly, as in the code below.

    Also, I'm guessing that you wanted to use job_n as the key in the cache dictionary. With the constant key 'job_b', every job would overwrite the same entry, and the dictionary would end up with a single item.

    import multiprocessing as mp
    from multiprocessing.managers import SyncManager
    
    n_cores = mp.cpu_count()
    
    def parallel_fn(job_n, cache):
        # Change 'job_b' to job_n
        cache[job_n] = job_n
        return job_n
    
    if __name__=="__main__":
        with SyncManager() as manager:
    
            shared_cache = manager.dict()
            
            # Create a list of tuples to use as args in starmap
            args = [(n, shared_cache) for n in range(n_cores)]
    
            with mp.Pool(n_cores) as pool:
                result = pool.starmap(parallel_fn, args)
    
                print(result)
            
            print(shared_cache)
    

    On my 8-core machine, the output is:

    [0, 1, 2, 3, 4, 5, 6, 7]
    {0: 0, 1: 1, 2: 2, 3: 3, 4: 4, 5: 5, 6: 6, 7: 7}
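
The zip behaviour described above can be reproduced without any multiprocessing. The sketch below is only illustrative: it uses a plain dict as a stand-in for manager.dict(), and build_args is a hypothetical helper name that does not appear in the original code. It also shows itertools.repeat as one alternative way to pair every job number with the same shared object.

```python
from itertools import repeat


def build_args(n_jobs, shared):
    """Pair each job number with the same shared object (hypothetical helper)."""
    # repeat(shared) is an endless iterator, so range(n_jobs) is the shortest
    # input and decides how many tuples zip produces.
    return list(zip(range(n_jobs), repeat(shared)))


cache = {}  # assumption: a plain dict standing in for manager.dict()

# Iterating an empty dict yields no keys, so zip stops immediately.
truncated = list(zip(range(4), cache))

# Every tuple carries a reference to the same cache object.
paired = build_args(4, cache)

print(truncated)  # []
print(paired)     # [(0, {}), (1, {}), (2, {}), (3, {})]
```

Whether to use repeat or the list comprehension from the answer is mostly a matter of taste; both produce one (job_n, cache) tuple per job.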