Tags: python, multiprocessing, python-multiprocessing

How to use multiprocessing.Pool in an imported module?


I have not been able to implement the suggestion from "Applying two functions to two lists simultaneously".

I guess it is because the module is imported by another module, so Windows spawns multiple Python processes?

My question is: how can I use the code below without the if __name__ == "__main__": guard?

args_m = [(mortality_men, my_agents, graveyard, families, firms, year, agent) for agent in males]
args_f = [(mortality_women, fertility, year, families, my_agents, graveyard, firms, agent) for agent in females]

with mp.Pool(processes=(mp.cpu_count() - 1)) as p:
    p.map_async(process_males, args_m)
    p.map_async(process_females, args_f)

Both process_males and process_females are functions. args_m and args_f are lists of argument tuples.

Also, I don't need to return anything. Agents are class instances that need updating.


Solution

  • The idea of if __name__ == '__main__': is to avoid infinite process spawning.

    When a function defined in your main script is sent to a worker, Python pickles it by reference (module name plus function name), so the worker process has to import your main script to find the function's code. On Windows, where workers are spawned rather than forked, this basically re-runs your script. If the code creating the Pool is in the same script and not protected by the if __name__ == '__main__': guard, then importing the function tries to launch another Pool, which tries to launch another Pool, and so on.

    Thus you should separate the function definitions from the actual main script:

    from multiprocessing import Pool

    # Define test functions outside the main block
    # so they can be imported without launching
    # a new Pool.
    def test_func():
        pass

    if __name__ == '__main__':
        with Pool(4) as p:
            r = p.apply_async(test_func)
            # ... do stuff
            result = r.get()
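
For the case in the question, where the Pool lives in a module that another script imports, one possible arrangement is sketched below. It is a single-file sketch under assumptions: process_agent and run_step are hypothetical names, and the doubling stands in for the real per-agent update. The Pool is created only inside a function, so importing the module starts no workers. The guard then only needs to appear in whichever script is actually executed, not in the imported module.

```python
import multiprocessing as mp


def process_agent(agent_id):
    """Hypothetical worker: stands in for the real per-agent update."""
    return agent_id * 2


def run_step(agent_ids, processes=2):
    """Create the Pool inside a function so that importing this module
    never starts worker processes by itself."""
    with mp.Pool(processes=processes) as pool:
        async_result = pool.map_async(process_agent, agent_ids)
        # Wait inside the with block: leaving it calls terminate() on the
        # pool, which would stop tasks that have not finished yet.
        return async_result.get()


# Only the script that is executed directly needs this guard. A module
# that merely defines run_step() can be imported without it.
if __name__ == "__main__":
    print(run_step([1, 2, 3]))  # prints [2, 4, 6]
```

Each worker receives a pickled copy of its arguments, so changes made to an agent inside a worker are not visible in the parent process. Returning the updated values, as above, is one way to carry them back.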