Search code examples
pythonpython-multiprocessingcx-oracle

How to run parallel queries using multiporcessing pool and cx_oracle


Am trying to run multiple processes using python's Pool, each process will be inquiring database this is why am trying to pass the connection to each process and then initiating a new cursor for query execution

db = DBConnection()

sessoion_pool, cursor= db.get_session_pool()
connection = sessoion_pool.acquire()

part_list = [e for l in get_parts_list() for e in l]

pool = Pool()
pool.map(scan_part_best_model, (part_list, connection))

yet it keeps throwing TypeError: can't pickle cx_Oracle.Connection objects I also tried doing same for Session and using session.acquire() in the function logic, but it throws same error


Solution

  • Connections cannot be shared between processes. You can perform the parallel queries in the same process, however, using code like the following:

    pool = cx_Oracle.SessionPool(USER, PASSWORD, DSN, min=5, max=20, increment=5)
    
    def do_query(part_num):
        with pool.acquire() as conn:
            cursor = conn.cursor()
            cursor.execute("select some_stuff from some_table where part_num = :1",
                    [part_num])
            print("Fetched part information for part", part_num)
    
    threads = [threading.Thread(target=do_query, args=(p,)) for p in get_parts_list()]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    

    Once you have fetched the data from the database, if you have significant work to do you can pass the fetched data to subprocesses via multiprocessing to do that work. You just can't perform any database work in a subprocess -- at least not without first connecting in that subprocess.