python-3.x, postgresql, psycopg2

Is it possible to persist a psycopg2 connection for more than 24 hours? I tried but could not get it to work.


I have created a database connection as in the code below. My question is basically: is it possible to persist a connection for more than 24 hours? My problem is that this code runs inside an application for more than 24 hours without performing any insert or update itself. Later, even when another process has added new records to the tables, the code below still returns the values it read more than 24 hours ago.

import pandas as pd
import psycopg2

class Database:

    def __init__(self, DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD,
                 DATABASE_PORT, DATABASE_NAME):
        self.host = DATABASE_HOST
        self.username = DATABASE_USERNAME
        self.password = DATABASE_PASSWORD
        self.port = DATABASE_PORT
        self.dbname = DATABASE_NAME
        self.conn = None

    def connect(self):
        if self.conn is None:
            self.conn = psycopg2.connect(
                host=self.host,
                user=self.username,
                password=self.password,
                port=self.port,
                dbname=self.dbname)

    def rows_to_frame(self, query):
        self.connect()
        return pd.read_sql(query, self.conn)

    def fetchall(self, query):
        self.connect()
        with self.conn.cursor() as cur:
            cur.execute(query)
            records = cur.fetchall()
        return records

    def fetchone(self, query):
        self.connect()
        with self.conn.cursor() as cur:
            cur.execute(query)
            records = cur.fetchone()
        return records

The folder structure is as follows:

Application
    --> pgconnection
        | --> __init__.py
        | --> db.py
    --> Source
        | --> main.py

For example:

main.py

from pgconnection.db import Database

db = Database(DATABASE_HOST, DATABASE_USERNAME, DATABASE_PASSWORD,
              DATABASE_PORT, DATABASE_NAME)
print(db.fetchall('select * from emp'))

Case 1: records in the emp table before 24 hours -

----------------------------
id  | name    | salary     |
----------------------------
1   | Sam     | 6000       |
----------------------------
2   | Adam    | 5000       |
----------------------------

and the result of the following call:

print(db.rows_to_frame('select * from emp'))

----------------------------
id  | name    | salary     |
----------------------------
1   | Sam     | 6000       |
----------------------------
2   | Adam    | 5000       |
----------------------------

Case 2:

After 24 hours, new data has been added to the emp table by another process, so the table now contains:

----------------------------
id  | name    | salary     |
----------------------------
1   | Sam     | 6000       |
----------------------------
2   | Adam    | 5000       |
----------------------------
3   | Ada     | 5600       |
----------------------------
4   | Stephen | 7899       |
----------------------------

But if I execute the query again on the same object (by running main.py), it still returns

----------------------------
id  | name    | salary     |
----------------------------
1   | Sam     | 6000       |
----------------------------
2   | Adam    | 5000       |
----------------------------

instead of

----------------------------
id  | name    | salary     |
----------------------------
1   | Sam     | 6000       |
----------------------------
2   | Adam    | 5000       |
----------------------------
3   | Ada     | 5600       |
----------------------------
4   | Stephen | 7899       |
----------------------------

What could be the reason? I am not able to solve this issue. Please guide me on where I am making a mistake; your input is valuable.


Solution

  • Based on my research into database connections and persisting them, it is not recommended to keep a connection open for a long duration. Instead, the session should be closed as soon as the read or write operation is completed, otherwise you may eventually run into a connection error such as:

    sorry, too many clients already

    If you wrap the database connection in a Python class and never close it, then once the client has been idle for a long time the connection will either be closed automatically or its results will effectively be cached on the Python side, so even if you call your methods again they return the cached results.

    So, the best you can do to limit the number of open connections is to divide your data into partitions and, for each partition, open a connection, process the operation, and then close the connection cleanly. Repeat the same for the remaining partitions; a sketch of this pattern follows below.
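
A minimal sketch of that pattern, assuming a plain psycopg2/pandas setup. The connection settings, helper names, and partition predicates below are illustrative assumptions, not part of the original code:

import pandas as pd
import psycopg2
from contextlib import contextmanager

# Hypothetical connection settings; replace with your own values.
DB_PARAMS = dict(host='localhost', user='postgres', password='secret',
                 port=5432, dbname='mydb')

@contextmanager
def short_lived_connection(**params):
    """Open a connection, hand it to the caller, and always close it afterwards."""
    conn = psycopg2.connect(**params)
    try:
        yield conn
        conn.commit()    # finish the transaction before closing
    finally:
        conn.close()     # never leave the session open between operations

def rows_to_frame(query):
    """Read one query into a DataFrame over a fresh, short-lived connection."""
    with short_lived_connection(**DB_PARAMS) as conn:
        return pd.read_sql(query, conn)

# Process the data partition by partition: one connection per partition,
# closed as soon as that partition has been handled.
partitions = ['salary < 6000', 'salary >= 6000']   # illustrative split
for predicate in partitions:
    df = rows_to_frame(f"select * from emp where {predicate}")
    print(df)

Because every call opens a fresh connection and closes it right after the read, each query reflects the current committed contents of the table instead of results tied to a session that has been idle for days, and the server does not accumulate long-lived idle connections.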