I have a model that I want to keep running in parallel for over 6 hours:
```python
pool = multiprocessing.Pool(10)
for subModel in self.model:
    # Note: apply_async takes the arguments as a tuple.
    pool.apply_async(self.compute, (subModel, features))
pool.close()
pool.join()
```
The problem is that this is too slow: I have to call `pool = multiprocessing.Pool(10)` and `pool.join()` every time, constructing and tearing down 10 worker processes, while the model is run a huge number of times over those 6 hours.
The solution, I believe, is to keep 10 workers running in the background, so that whenever new data comes in it goes into the model right away, without wasting time creating and destroying workers.
Is there a way in Python to keep workers running long-term, so that I don't need to start and stop them over and over again?
You don't need to `close()` and `join()` (and destroy) the `Pool` after submitting one set of tasks to it. If you want to make sure that your `apply_async()` has completed before going on, just call `apply()` instead, and use the same `Pool` next time. Or, if you can do other stuff while waiting for the tasks, save the result object returned by `apply_async()` and call `wait()` on it once you can't go on without it being complete.
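A minimal sketch of that pattern, assuming a module-level `compute` function standing in for `self.compute` (anything you submit must be picklable; a bound method pickles only if its instance does), with placeholder sub-models and feature batches:

```python
import multiprocessing


def compute(sub_model, features):
    # Stand-in for the real self.compute.
    return sub_model, features


if __name__ == "__main__":
    # Build the pool once; its 10 worker processes stay alive until
    # the pool is closed, no matter how many batches you feed it.
    pool = multiprocessing.Pool(10)

    sub_models = ["a", "b", "c"]               # stand-in for self.model
    for features in ([1, 2], [3, 4], [5, 6]):  # stand-in for incoming data
        # Submit one task per sub-model; note the args tuple.
        results = [
            pool.apply_async(compute, (sub_model, features))
            for sub_model in sub_models
        ]
        # Do other work here if you have any, then block until this
        # batch is done.
        outputs = [r.get() for r in results]
        print(outputs)

    # Tear the pool down only once, at the very end of the run.
    pool.close()
    pool.join()
```

`get()` is used here rather than `wait()`: it blocks the same way, but also returns the task's result and re-raises any exception that was raised in the worker.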