python multiprocessing threadpool pathos

Python distributed ThreadPool


Do any of the Python ThreadPool packages (multiprocessing, pathos) distribute worker threads across multiple CPUs? If not, would I need to create a ProcessPool, then create a ThreadPool on each ProcessPool worker?


Solution

  • I'm the pathos author. multiprocessing has a Pool that will distribute jobs to multiple cpus. Here's an example (using multiprocess, which is a fork of multiprocessing with better serialization):

    >>> import multiprocess as mp
    >>> mp.Pool().map(lambda x:x, range(4))
    [0, 1, 2, 3]

    pathos uses multiprocess in pathos.pools.ProcessPool. There's also pathos.pools.ParallelPool (from ppft) that can spawn jobs with distributed computing (not on the same machine).

    If you are looking to do a hierarchical pool (i.e. ThreadPool and ProcessPool), then you should use pathos for that. See here: https://stackoverflow.com/a/28491924/2379433 and https://stackoverflow.com/a/28613077/2379433 and https://stackoverflow.com/a/40852258/2379433.

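    You could also do it with multiprocessing alone, but I think nesting process pools that way is no longer allowed: Pool workers are daemonic processes, and daemonic processes are not allowed to have children. See https://stackoverflow.com/a/8963618/2379433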
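For the layout the question describes (a ProcessPool whose workers each run a ThreadPool), here is a minimal sketch using only the standard library rather than pathos. The names `square`, `split_chunks`, `process_chunk`, and `run_hierarchical` are hypothetical helpers made up for illustration, and the chunking strategy is just one assumption about how to divide the work. Starting threads inside a pool worker is fine, since the daemonic restriction applies to child processes, not threads.

```python
from multiprocessing import Pool
from multiprocessing.pool import ThreadPool


def square(x):
    # Hypothetical per-item task; stands in for the real work.
    return x * x


def split_chunks(items, n_chunks):
    # Split items into at most n_chunks contiguous lists.
    items = list(items)
    size = max(1, -(-len(items) // n_chunks))  # ceiling division
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_chunk(chunk, n_threads=4):
    # Runs inside one worker process: fan the chunk out over threads.
    with ThreadPool(n_threads) as threads:
        return threads.map(square, chunk)


def run_hierarchical(items, n_procs=2, n_threads=4):
    # Outer level: one chunk per process. Inner level: a ThreadPool per chunk.
    chunks = split_chunks(items, n_procs)
    with Pool(n_procs) as procs:
        nested = procs.starmap(process_chunk, [(c, n_threads) for c in chunks])
    # Flatten the per-chunk results back into one list, preserving order.
    return [y for part in nested for y in part]
```

Call `run_hierarchical(range(8))` from under an `if __name__ == "__main__":` guard in a script; it should return `[0, 1, 4, 9, 16, 25, 36, 49]`. Because this uses plain multiprocessing, `square` has to be a module-level function that pickle can find; the lambda in the example above only works because multiprocess has better serialization.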