Tags: python, multithreading, python-multithreading, low-latency

Python: keep reusing the same worker pool in a long-running process


I have a model that I want to keep running in parallel for over 6 hours:

pool = multiprocessing.Pool(10)
for subModel in self.model:
    # apply_async takes the positional arguments as a single tuple
    pool.apply_async(self.compute, (subModel, features))
pool.close()
pool.join()

The problem is that this is too slow: the model runs a huge number of times over the 6 hours, and on every run I have to call multiprocessing.Pool(10) and then pool.close()/pool.join(), constructing and tearing down 10 worker processes each time.

What I believe I need is to keep 10 workers running in the background, so that whenever new data comes in it goes straight into the model, without wasting time creating and destroying workers.

Is there a way in Python to keep such a long-running worker pool alive, so that I don't need to start and stop it over and over again?


Solution

  • You don't need to close() and join() (and thereby destroy) the Pool after submitting each batch of tasks to it. Create the Pool once, reuse it for every batch, and only shut it down when the whole run is finished. If you need to be sure a task has completed before going on, call the blocking apply() instead of apply_async(). Or, if you can do other work while the tasks run, keep the AsyncResult objects returned by apply_async() and call wait() (or get(), which also returns the value) on them once you can't proceed without the results being complete.