Search code examples
python-3.xpostgresqlpsycopg2psql

What is correct way to use psycopg2 cursors in threads?


I feel the answer to my question is in these two SO questions, but I find the answers very poorly worded (or above my pay grade)

  1. multi thread python psycopg2
  2. Are transactions in PostgreSQL via psycopg2 per-cursor or per-connection?

Question: What is the correct way to use psycopg2 to ensure it is thread safe

Option 1: Each thread has its own cursor

import threading
import psycopg2

conn = psycopg2.connect (
    host=127.0.0.1,
    user='john',
    password='1234',
    dbname='foo',
    port=1234)

class Foo (threading.Thread):
    def __init__ (self):
        threading.Thread.__init__(self)

    def run (self):
        global conn

        cur = conn.cursor()
        sql_query="SELECT * from foo;"
        print(cur.execute (sql_query))
        conn.commit()

num_threads = 100
threads = []

for i in seq (num_threads):
    threads.append (Foo())

for i in seq (num_threads):
    threads[i].start()

for i in seq (num_threads):
    threads[i].join()

Option 2: Each thread has it's own connection

import threading
import psycopg2

db_conn = psycopg2.connect (
    host=127.0.0.1,
    user='john',
    password='1234',
    dbname='foo',
    port=1234)

class Foo (threading.Thread):
    def __init__ (self):
        threading.Thread.__init__(self)
        self.conn = psycopg2.connect (
            host=127.0.0.1,
            user='john',
            password='1234',
            dbname='foo',
            port=1234)

    def run (self):
        cur = self.conn.cursor()
        sql_query="SELECT * from foo;"
        print(cur.execute (sql_query))
        conn.commit()

num_threads = 100
threads = []

for i in seq (num_threads):
    threads.append (Foo())

for i in seq (num_threads):
    threads[i].start()

for i in seq (num_threads):
    threads[i].join()

Solution

  • Each thread should have its own database connection.

    A PostgreSQL connection can handle only one statement at a given time (unless you are using a server side cursor, but even then the connection can handle only one FETCH at the same time).

    So if several threads were to share a database connection, they'd have to coordinate carefully to make sure that only one thread uses the connection at the same time. For example, you cannot send a new query while another thread is still waiting for a query result.