Tags: python, for-loop, parallel-processing, joblib

Parallel for loop in Python


I would like to parallelize a for loop in Python.

The loop is fed by a generator, and I expect about 1 billion items.

It turned out that joblib has a giant memory leak here:

Parallel(n_jobs=num_cores)(delayed(testtm)(tm) for tm in powerset(all_turns))

I do not want to store any data in this loop, only print something out occasionally, but the main process grows to 1 GB within seconds.

Are there any other frameworks that can handle this many iterations?


Solution

  • joblib's Parallel collects every result into one big list, which is why the parent process keeps growing. multiprocessing.Pool.imap_unordered streams results lazily instead; it takes the plain function, so joblib's delayed() wrapper is not needed:
    
    from multiprocessing import Pool
    
    if __name__ == "__main__":
        pool = Pool()  # use all available CPUs
        for result in pool.imap_unordered(testtm, powerset(all_turns),
                                          chunksize=1000):
            print(result)