python python-multiprocessing

How to share memory with multiprocessing.Pool's workers without using global variables?


There are two functions:

def tibidam(..., foo_dyn, index):
    print("(" + str(index) + ") B:", foo_dyn)

    for i in range(...):
        for j ...
            if j not in foo_dyn:
                foo_dyn[ep] = j

    print("(" + str(index) + ") A:", foo_dyn)

def tata(..., foo):
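    foo_dyn = Array('i', len(foo))  # shared array of C ints, zero-initialized
    
    foo_dyn = foo  # rebinds the name to the plain list; the Array above is discarded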

    with Pool(processes=4) as pool:
        pool.starmap(tibidam, [(..., foo_dyn, i) 
            for i in range(4)])
    
    return foo

Output (formatted):

foo  : [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(0) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(1) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(2) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(3) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(0) A: [27, 1,  2,  3,  64, 5,  6,  7,  80, 9,  10, 11]
(2) A: [0,  1,  64, 3,  4,  5,  13, 7,  8,  9,  92, 11]
(3) A: [0,  1,  2,  31, 4,  5,  6,  73, 8,  9,  10, 18]
(1) A: [0,  18, 2,  3,  4,  27, 6,  7,  8,  99, 10, 11]
...

Expected output (formatted):

foo  : [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(0) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(1) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(2) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(3) B: [0,  1,  2,  3,  4,  5,  6,  7,  8,  9,  10, 11]
(0) A: [27, 1,  2,  3,  64, 5,  6,  7,  80, 9,  10, 11]
(2) A: [27, 1,  55, 3,  64, 5,  13, 7,  80, 9,  92, 11]
(3) A: [27, 1,  55, 31, 64, 5,  13, 73, 80, 9,  92, 18]
(1) A: [27, 87, 55, 31, 64, 88, 13, 73, 80, 99, 92, 18]
...

How can I make a change to foo_dyn visible in all workers whenever foo_dyn changes? It seems that pool.starmap(...) creates a copy of foo_dyn for each process. I want to pass foo_dyn to the pool only once and, again, without using global variables at all.
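The copying described here can be reproduced with a small sketch. The names set_slot and run_with_plain_list are hypothetical and not part of the question's code; chunksize=1 is set only so that every task is pickled separately.

```python
from multiprocessing import Pool

def set_slot(data, index):
    # Runs in a worker: `data` is an unpickled copy of the parent's list.
    data[index] = -1
    return data

def run_with_plain_list():
    foo = list(range(4))
    with Pool(processes=2) as pool:
        # Each argument tuple is pickled and sent to a worker process.
        copies = pool.starmap(set_slot, [(foo, i) for i in range(4)], chunksize=1)
    return foo, copies

if __name__ == "__main__":
    foo, copies = run_with_plain_list()
    print(foo)     # the parent's list is unchanged: [0, 1, 2, 3]
    print(copies)  # each returned list has only that task's slot set to -1
```

A plain list is serialized on its way to a worker, so writes made there never reach the parent or the other workers, which matches the "A:" lines in the output above.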

AFAIK, multiprocessing.Pool supports the initializer and initargs arguments, so I can write my own initializer:

def _init(foo):
    global foo_dyn

    foo_dyn = foo

However, this uses the global variable foo_dyn (and, by the way, using the _init function doesn't solve the problem). I have also seen a few questions about almost the same problem, but all of their solutions relied on global variables.


Solution

  • I found a solution that doesn't use global variables:

    from multiprocessing import Pool, Manager
    
    def tibidam(..., foo_dyn, index):
        for i in range(...):
            for j ...
                if j not in foo_dyn:
                    foo_dyn[ep] = j  # the write goes through the proxy to the manager process
    
    def tata(..., foo):
        # Manager().list returns a proxy; every worker talks to the same underlying list
        foo_dyn = Manager().list(foo)
    
        with Pool(processes=4) as pool:
            pool.starmap(tibidam, [(..., foo_dyn, i)
                for i in range(4)])
    
        return foo_dyn
    

    Thank you all! :>
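A self-contained sketch of the same Manager().list idea follows. The loop bounds in the code above are elided, so the worker body here is a hypothetical stand-in: fill_slots, run_shared and the marker value 100 + index are all assumptions made for illustration. Each worker writes only to its own slots, so the result is deterministic.

```python
from multiprocessing import Manager, Pool

def fill_slots(shared, index, n_workers):
    # Hypothetical worker body: mark every slot that belongs to this worker.
    for pos in range(index, len(shared), n_workers):
        shared[pos] = 100 + index  # forwarded to the manager process

def run_shared(foo, n_workers=4):
    with Manager() as manager:
        shared = manager.list(foo)  # a proxy; pickling it sends a handle, not the data
        with Pool(processes=n_workers) as pool:
            pool.starmap(fill_slots,
                         [(shared, i, n_workers) for i in range(n_workers)])
        return list(shared)  # copy to a plain list before the manager shuts down

if __name__ == "__main__":
    print(run_shared(list(range(12))))
    # [100, 101, 102, 103, 100, 101, 102, 103, 100, 101, 102, 103]
```

Note that a check such as `if j not in foo_dyn` followed by an assignment is two separate proxy calls, so another worker can write in between. If that matters, a Manager().Lock() passed alongside the list is one way to guard the pair.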