python, multiprocessing, python-multiprocessing, pool, concurrent-processing

How to launch 100 workers in multiprocessing?


I am trying to use Python to call my function, my_function(), 100 times. Since my_function() takes a while to run, I want to parallelize the calls.

I tried reading the docs at https://docs.python.org/3/library/multiprocessing.html but could not find an easy example to get started with launching 100 workers. Order does not matter; I just need the function to run 100 times.

Any suggestions/code tips?


Solution

  • The very first example on the page you link to works, so I'm just going to copy and paste it here and change two values.

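    from multiprocessing import Pool

    def f(x):
        # Stand-in for the slow function; it receives one argument per task
        return x*x

    if __name__ == '__main__':
        # Start 100 worker processes and run f once for each of the 100 inputs
        with Pool(100) as p:
            print(p.map(f, range(100)))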
    

    EDIT: You just said that you're using Google Colab. I think Google Colab gives you two CPU cores, not more (you can check by running !cat /proc/cpuinfo). With two CPU cores, only two pieces of computation can execute at once.

    So, if your function does not spend most of its time waiting for external I/O (e.g. from the network), launching 100 workers makes no sense: you've got 50 executions competing for each core. The operating system's multitasking handles this by interrupting one process, saving its state to RAM, letting another process run for a while, interrupting that one, and so on.

    All this swapping of state is, of course, overhead. You'd be faster just running as many instances of your function in parallel as you have cores. Read the documentation on Pool as used above for more information.
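
As a rough sketch of that advice, the pool can be sized to the number of cores while still running the function 100 times. The names worker_count and run_all are made up for this illustration, and my_function here is only a placeholder for the real slow function. It assumes the work is CPU-bound and that results may arrive in any order, as the question allows. Note that Pool() with no argument already defaults to the machine's CPU count.

```python
import os
from multiprocessing import Pool


def my_function(task_id):
    # Hypothetical stand-in for the real slow function. It accepts the task
    # index only because the pool passes one argument per task.
    return task_id * task_id


def worker_count(n_tasks):
    # Assumption: CPU-bound work, so more workers than cores (or than tasks)
    # only adds scheduling overhead.
    cores = os.cpu_count() or 1  # os.cpu_count() can return None
    return max(1, min(n_tasks, cores))


def run_all(n_tasks):
    # The pool hands the n_tasks calls out to a small, fixed set of workers.
    with Pool(processes=worker_count(n_tasks)) as pool:
        # imap_unordered yields each result as soon as it is ready,
        # so the order of the returned list is not guaranteed.
        return list(pool.imap_unordered(my_function, range(n_tasks)))


if __name__ == "__main__":
    results = run_all(100)
    print(len(results))  # one result per task
```

On a two-core machine this starts two workers that share the 100 calls between them, rather than 100 processes fighting over two cores. If the function mostly waits on the network instead, a larger worker count can still pay off.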