I'm using multiprocessing.Pool to execute some function. And in the function I need to connect to database (using sqlalchemy). I try to share the sqlalchemy connection pool to child processes by using multiprocessing.Queue, like this:
from multiprocessing import Pool, Manager
def process(data, queue):
db = queue.get()
with db.connect() as connection:
# execute some query
data_list = [] # list of data I'm going to deal with
pool = Pool(8)
manager = Manager()
queue = manager.Queue()
db = sqlalchemy.create_engine()
for data in data_list:
queue.put(db)
pool.apply_async(func=process, args=(data, db)) # This is 1st way I try
pool.apply_async(func=process, args=(data, queue)) # This is 2nd way I try
I try these two way, but they both raise error.
The first way would raise a Broken Pipe Error
when executing with db.connect() as connection
.
The second way it would raise a Can't pickle local object 'create_engine.<locals>.connect'
error.
I search this problem and found some people said sharing connection pool to child processes is feasible, but how should I share the engine with multiprocess?
As @charchit's comment, here is documentation about using sqlalchemy with multiprocessing