Tags: python, sql, postgresql, module, psycopg2

Simplify database (psycopg2) usage by creating a module


Let me preface this by saying that I am fairly new to Python and I apologize if this is not the appropriate place for this question.

I am using the psycopg2 module to manipulate a PostgreSQL database. The general usage would look something like this:

# Example 1

import psycopg2

conn = psycopg2.connect(database="mydb", user="postgres")
cur = conn.cursor()

cur.execute("SELECT * FROM mytable;")
rows = cur.fetchall()
for i, row in enumerate(rows):
    print "Row", i, "value = ", row

cur.close()
conn.close()

This will open a connection to the mydb database, select all fields from the table mytable and print them, and then close the connection.

What I would like to do is factor out some of these functions into a module because I will need to call them over and over throughout many scripts. For this example, ideally I would have a module named core which contains three functions:

  • core.db_init() - which opens the connection and consists of the first two lines of code above.
  • core.db_query(query) - which performs the desired SQL query and consists of the third line in the above code.
  • core.db_close() - which closes the connection and consists of the last two lines.

I have tried creating a module as follows:

# Module core.py

import psycopg2

def db_init():
    conn = psycopg2.connect(database="mydb", user="postgres")
    cur = conn.cursor()

def db_query(query):
    cur.execute(query)

def db_close():
    cur.close()
    conn.close()

But I get namespace errors when I try to recreate Example 1 using this module:

# Example 2

import core

core.db_init()
core.db_query("SELECT * FROM mytable;")

rows = cur.fetchall()
for i, row in enumerate(rows):
    print "Row", i, "value = ", row    

core.db_close()

I'm not even sure a module is actually what I want. Should I be using a class instead? Again, I'm very new to all of this. But if someone could help me figure out a better way to do this, I would be very grateful.


Solution

  • Your main issue is that each variable is local to the function it was assigned in, unless you declare it global, like this:

    def db_init():
        global conn
        conn = psycopg2....
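
To show what the global route could look like for the whole module, here is a minimal sketch. It rests on two assumptions that are not in your code: db_init() receives an already-open connection (for example the result of psycopg2.connect(database="mydb", user="postgres")) instead of opening one itself, and db_query() hands the rows back so the calling script never has to touch cur.

```python
# core.py (sketch): module-level state shared by the three functions
conn = None
cur = None


def db_init(connection):
    # Assumption: the caller passes an open DB-API connection,
    # e.g. psycopg2.connect(database="mydb", user="postgres")
    global conn, cur          # rebind the module-level names, not new locals
    conn = connection
    cur = conn.cursor()


def db_query(query):
    # Only reading the globals here, so no global statement is needed
    cur.execute(query)
    # description is None when the statement produced no result set
    if cur.description is None:
        return None
    return cur.fetchall()


def db_close():
    cur.close()
    conn.close()
```

A script would then do rows = core.db_query("SELECT * FROM mytable;") rather than calling cur.fetchall() itself, since cur lives in the module and not in the script.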
    

    A better approach would be to convert this into a class. A basic example:

    import psycopg2
    
    class MyDatabase():
        def __init__(self, db="mydb", user="postgres"):
            self.conn = psycopg2.connect(dbname=db, user=user)
            self.cur = self.conn.cursor()
    
        def query(self, query):
            self.cur.execute(query)
    
        def close(self):
            self.cur.close()
            self.conn.close()
    
    db = MyDatabase()
    db.query("SELECT * FROM mytable;")
    db.close()
    

    Now, the SELECT query won't do much as written, since query() only calls cur.execute() and never fetches or returns the rows.
    I kept it that way on purpose so the code stays close to what you wrote. For queries that are expected to return a value, you'll want query() to return the results (for example via cur.fetchall()).
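
One possible way to make query() return values is sketched below. This is a hypothetical variant of the class above, not the only way to do it: the constructor takes an already-open connection (an assumption made so the sketch works with any DB-API driver), and the optional params argument is an addition for passing query parameters.

```python
class MyDatabase:
    """Thin wrapper around an open DB-API connection (hypothetical variant)."""

    def __init__(self, conn):
        # Assumption: conn is already open, e.g.
        # psycopg2.connect(dbname="mydb", user="postgres")
        self.conn = conn
        self.cur = conn.cursor()

    def query(self, query, params=None):
        """Run a statement; return all rows if it produced any, else None."""
        if params is None:
            self.cur.execute(query)
        else:
            self.cur.execute(query, params)
        # description is None when the statement produced no result set,
        # in which case fetchall() must not be called
        if self.cur.description is None:
            return None
        return self.cur.fetchall()

    def close(self):
        self.cur.close()
        self.conn.close()
```

With this, rows = db.query("SELECT * FROM mytable;") gives you the same list that cur.fetchall() returned in your Example 1.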

    Your function-based approach has "namespace" issues because variables assigned inside a function live in that function's local scope, and therefore other functions can't normally access them.

    In a class, by contrast, every method can reach the instance's attributes through self, so it is not as limited out of the box.
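
The difference can be seen without a database at all. In this small illustration the string "connection" is just a stand-in for a real connection object, and the class name Database is made up for the example:

```python
def db_init():
    conn = "connection"       # local to db_init, discarded when it returns


def db_close():
    return conn               # looks for a global named conn, finds none


class Database:
    def __init__(self):
        self.conn = "connection"   # stored on the instance

    def close(self):
        return self.conn           # any method can reach it through self
```

Calling db_init() and then db_close() raises a NameError, which is the "namespace error" from your Example 2, while Database().close() finds the attribute without any global declaration.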

    You could create global variables and declare them as global in the functions, but as I mentioned in a comment:

    You'd want to make this into a class. A database connection is a session-based entity, just as class instances are. Handle each connection as a living entity by wrapping it in a class; otherwise cur and conn will be function-scoped variables and you will need to work them into the global scope.