Say I have 50 processes, and I'm using them to operate on (say) 20000 different input values. (I'm using the pathos library, which I think operates similarly to the multiprocessing library in Python.)
pool = pathos.multiprocessing.ProcessingPool(nodes=50)
pool.map(function, inputs)
I want to create one SQLAlchemy database engine for each process (but I don't have the resources to create one for each input value). Then I want all inputs that are processed using that process to work with the same database engine.
How can I do this?
I'm the author of both pathos and multiprocess. It turns out that multiprocess is actually what pathos is using under the hood, but maybe it's not obvious that this is the case. You can get at it from pathos:
>>> import pathos
>>> pathos.pools._ProcessPool
<class 'multiprocess.pool.Pool'>
The above is the raw Pool directly from multiprocess, while pathos.pools.ProcessPool is a higher-level wrapper with some additional features, but it does not (yet) expose all the keyword arguments of the lower-level Pool.
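A minimal sketch of the one-engine-per-worker pattern using the raw Pool's initializer/initargs keywords (multiprocess mirrors the stdlib multiprocessing API, which is used here so the example is self-contained). The "engine" is mocked with a string so the sketch runs without SQLAlchemy; in real code the initializer would call sqlalchemy.create_engine(url) instead:

```python
import multiprocessing  # multiprocess / pathos.pools._ProcessPool expose the same Pool API

_engine = None  # module-level; each worker process gets its own copy

def _init_worker(url):
    # Runs exactly once in each worker process when the pool starts.
    # Real code would do: _engine = sqlalchemy.create_engine(url)
    global _engine
    _engine = "engine(%s)@pid%d" % (url, multiprocessing.current_process().pid)

def _handle(value):
    # Every input dispatched to this worker reuses the same _engine.
    return value, _engine

def run(n_workers=4, n_inputs=100):
    with multiprocessing.Pool(n_workers, _init_worker, ("sqlite://",)) as pool:
        return pool.map(_handle, range(n_inputs))

if __name__ == "__main__":
    results = run()
    # The number of distinct engines is bounded by the worker count,
    # not by the number of inputs.
    print(len({e for _, e in results}), "engines for", len(results), "inputs")
```

Since engines are created per process (never shared across fork boundaries), this also sidesteps SQLAlchemy's warning about reusing a connection pool across processes.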