
How to share a DB connection pool in Python multiprocessing?


I'm using multiprocessing.Pool to execute a function, and inside that function I need to connect to a database (using SQLAlchemy). I tried to share the SQLAlchemy connection pool with the child processes via a multiprocessing.Queue, like this:

import sqlalchemy
from multiprocessing import Pool, Manager

def process(data, queue):
    db = queue.get()
    with db.connect() as connection:
        pass  # execute some query

data_list = []  # list of data I'm going to deal with
pool = Pool(8)
manager = Manager()
queue = manager.Queue()
db = sqlalchemy.create_engine("postgresql://user:pass@host/db")  # placeholder URL
for data in data_list:
    queue.put(db)
    pool.apply_async(func=process, args=(data, db))     # 1st way I tried: pass the engine directly
    pool.apply_async(func=process, args=(data, queue))  # 2nd way I tried: pass it through the queue

I tried both ways, and each raises an error.

The first way raises a BrokenPipeError when executing with db.connect() as connection.

The second way raises Can't pickle local object 'create_engine.<locals>.connect'.

I searched for this problem and found some people saying that sharing a connection pool with child processes is feasible, but how should I share the engine across processes?


Solution

  • As noted in @charchit's comment, the SQLAlchemy documentation covers using connection pools with multiprocessing: https://docs.sqlalchemy.org/en/20/core/pooling.html#using-connection-pools-with-multiprocessing-or-os-fork
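
The gist of that documentation: an Engine wraps a pool of live socket connections, which can neither be pickled nor safely reused across a fork, so each worker process should create its own engine rather than receive one from the parent. Below is a minimal sketch of that pattern using a Pool initializer; the connection URL and the SELECT 1 query are placeholders, not part of the original question:

import sqlalchemy
from multiprocessing import Pool

engine = None  # per-process global, set by the initializer in each worker

def init_worker():
    # Runs once inside each child process, so every worker builds its own
    # engine and its own connection pool; nothing is inherited from the parent.
    global engine
    engine = sqlalchemy.create_engine("postgresql://user:pass@host/db")  # placeholder URL

def process(data):
    with engine.connect() as connection:
        connection.execute(sqlalchemy.text("SELECT 1"))  # placeholder query

if __name__ == "__main__":
    data_list = []  # list of data to deal with
    with Pool(8, initializer=init_worker) as pool:
        pool.map(process, data_list)

If the engine has to be created in the parent first (say, to run setup queries), the same documentation page offers another recipe: keep the module-level engine, but have each child discard the connections it inherited across the fork. A sketch, again with placeholder names; engine.dispose(close=False) requires SQLAlchemy 1.4.33 or newer, and the recipe matters only under the fork start method (the default on Linux), since under spawn each child re-imports the module and builds a fresh engine anyway:

import sqlalchemy
from multiprocessing import Pool

engine = sqlalchemy.create_engine("postgresql://user:pass@host/db")  # placeholder URL

def init_worker():
    # After the fork, drop the pooled connections inherited from the parent
    # without closing them (the parent still owns them); the child's pool
    # then opens fresh connections on first use.
    engine.dispose(close=False)

def process(data):
    with engine.connect() as connection:
        connection.execute(sqlalchemy.text("SELECT 1"))  # placeholder query

if __name__ == "__main__":
    data_list = []
    with Pool(8, initializer=init_worker) as pool:
        pool.map(process, data_list)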