Tags: python · postgresql · psycopg2 · google-cloud-sql · google-cloud-run

Optimal PostgreSQL connection method with Python in serverless environment?


I have developed a Cloud Run service using Python with Flask that accepts incoming HTTP requests. When a request comes in, the payload is sent to a class that acts as a context manager: on entry it establishes a database connection, and on exit it closes it, to ensure there are no connection leaks.

Here is how the database connection is set up in the class's __enter__ method:

if self.db_connection is None:
    try:
        # Connection details were redacted here; placeholder names shown
        self.db_connection = psycopg2.connect(
            host=DB_HOST,
            dbname=DB_NAME,
            user=DB_USER,
            password=DB_PASSWORD,
        )
        self.db_cursor = self.db_connection.cursor()
    except Exception as e:
        print(f'Error connecting to database: {e}')
        return
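
For completeness, the matching __exit__ looks roughly like this (a minimal sketch; the attribute names mirror the snippet above):

def __exit__(self, exc_type, exc_value, traceback):
    # Close the cursor and connection so the Cloud SQL connection slot is freed
    if self.db_cursor is not None:
        self.db_cursor.close()
        self.db_cursor = None
    if self.db_connection is not None:
        self.db_connection.close()
        self.db_connection = None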

We are using Google Cloud SQL with a PostgreSQL 15.2 instance and connect with the psycopg2 Python library. The instance allows a maximum of 100 active connections per database, so we cap the number of concurrent HTTP requests/tasks below that, at around 80, which slows the workflow down.

There are definitely better ways, but I am not very familiar with them yet. One such way seems to be connection pooling, but I'm not yet sure about the best way to go about it. Because we have multiple active instances that each connect to the database during a 'task', if we create a connection pool, won't multiple pools be created, each with its own set of reserved connections? How would we go about creating a single pool of, say, 80 connections that get re-used? And would a single pool of 80 connections then take up 80% of the Cloud SQL database's max_connections?

What's the best approach in this situation? We'll probably reconfigure the database to increase the max_connections parameter, but it's probably also a good idea to change the way we connect to it at the same time.

Specifications/information:

  • Python 3.9
  • psycopg2
  • Cloud Run using Flask
  • Tasks created using Cloud Tasks with the Cloud Run and Flask app route as HTTP targets
  • A minimum of 10,000 HTTP requests (tasks) per day, each connecting to and disconnecting from the database, but likely more, possibly up to 100,000/day
  • PostgreSQL 15.2 on Google Cloud SQL
  • All hosted on Google Cloud Platform inside of a VPC network

We haven't tried a different way of connecting, but the possible solutions/information I've discovered so far are:

  • The pooling classes in psycopg2's psycopg2.pool module: AbstractConnectionPool, SimpleConnectionPool, ThreadedConnectionPool, and PersistentConnectionPool (see the sketch after this list)
  • Other connection methods for Cloud SQL as described in this documentation; however, we'd like to continue using the psycopg2 library, and it doesn't seem like those will work
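
For reference, a minimal sketch of what the first option would look like with psycopg2's built-in ThreadedConnectionPool (the connection details and pool sizes here are placeholders, not recommendations):

from psycopg2 import pool

# One pool per process; each Cloud Run instance would reserve up to maxconn connections
db_pool = pool.ThreadedConnectionPool(
    minconn=1,
    maxconn=5,
    host='CLOUD-SQL-PRIVATE-IP',
    dbname='YOUR-DB',
    user='YOUR-USER',
    password='YOUR-PASSWORD',
)

conn = db_pool.getconn()      # borrow a connection from the pool
try:
    with conn.cursor() as cur:
        cur.execute('SELECT 1')
finally:
    db_pool.putconn(conn)     # return it to the pool instead of closing it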

Solution

  • I would definitely recommend leveraging a connection pool for your use case, to take advantage of connection re-use and cut the overhead of opening upwards of 100,000 connections per day. A popular Python library that supports connection pooling with psycopg2 is SQLAlchemy.

    You could establish a global connection pool, similar to your code above, that is shared across class objects.

    The code for SQLAlchemy to establish a psycopg2 connection engine (pool) is as follows:

    import sqlalchemy
    
    pool = None
    
    def init_connection_pool(user, password, ip_address, dbname):
      return sqlalchemy.create_engine(
          # Equivalent URL:
          # postgresql+psycopg2://<db_user>:<db_pass>@<db_host>:<db_port>/<db_name>
          sqlalchemy.engine.url.URL.create(
              drivername="postgresql+psycopg2",
              username=user,
              password=password,
              host=ip_address,
              port=5432,
              database=dbname,
          ),
          # requiring SSL is recommended for Private IP connections
          connect_args={'sslmode': 'require'},
          # Pool size is the maximum number of permanent connections to keep.
          pool_size=10,
          # Temporarily exceeds the set pool_size if no connections are available.
          max_overflow=2,
          # The total number of concurrent connections for your application will be
          # a total of pool_size and max_overflow.
          # 'pool_timeout' is the maximum number of seconds to wait when retrieving a
          # new connection from the pool. After the specified amount of time, an
          # exception will be thrown.
          pool_timeout=30,  # 30 seconds
          # 'pool_recycle' is the maximum number of seconds a connection can persist.
          # Connections that live longer than the specified amount of time will be
          # re-established
          pool_recycle=1800,  # 30 minutes
      )
    
    # ... inside your endpoint code
    
    global pool
    if pool is None:
      pool = init_connection_pool("YOUR-USER", "YOUR-PASSWORD", "CLOUD-SQL-PRIVATE-IP", "YOUR-DB")
    
    # acquire a connection from the pool; it is returned when the block exits
    with pool.connect() as db_conn:
      # ... run your queries here
    

    I have just set some defaults for configuring the pool; they can easily be tweaked as you see fit.
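
    As a usage sketch (the table and column names here are hypothetical), a connection checked out with pool.connect() can run parameterized queries and is returned to the pool automatically when the with block exits:

    from sqlalchemy import text
    
    with pool.connect() as db_conn:
        result = db_conn.execute(
            text("SELECT id, status FROM tasks WHERE status = :status"),
            {"status": "pending"},
        )
        rows = result.fetchall()

    Note that each Cloud Run instance creates its own engine, so the effective connection ceiling is (pool_size + max_overflow) multiplied by the number of concurrent instances. Keeping that product under the database's max_connections addresses your "80% of max_connections" concern: you size one modest pool per instance rather than one connection per request.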