I built a simple class with a couple methods to make my life a little easier when loading data into Postgres with Python. I also attempted to package it so I could pip install it (just to experiment, never done that before).
import io

import pandas as pd
import psycopg2  # driver behind the 'postgresql+psycopg2' URL
from sqlalchemy import create_engine


class py_psql:
    # Stored under a separate name so it does not overwrite the engine() method.
    _engine = None

    def engine(self, username, password, hostname, port, database):
        # Build the SQLAlchemy connection URL and create the engine.
        connection = 'postgresql+psycopg2://{}:{}@{}:{}/{}'.format(
            username.lower(), password, hostname, port, database)
        self._engine = create_engine(connection)

    def query(self, query):
        # Run a query and return the result as a DataFrame.
        return pd.read_sql_query(query, self._engine)

    def write(self, write_name, df, if_exists='replace', index=False):
        # DataFrame size in MB, used to confirm large writes.
        mem_size = df.memory_usage().sum() / 1024**2
        pg_eng = self._engine

        def write_data():
            # Create the table from the column definitions only (no rows).
            df.head(0).to_sql(write_name, pg_eng, if_exists=if_exists, index=index)
            conn = pg_eng.raw_connection()
            cur = conn.cursor()
            # Dump the rows to an in-memory TSV buffer and bulk load it with COPY.
            output = io.StringIO()
            df.to_csv(output, sep='\t', header=False, index=index)
            output.seek(0)
            cur.copy_from(output, write_name, null="")
            conn.commit()

        if mem_size > 100:
            validate_size = input('DataFrame is {}mb, proceed anyway? (y/n): '.format(mem_size))
            if validate_size == 'y':
                write_data()
            else:
                print("Canceling write to database")
        else:
            write_data()
My package directory looks like this:
py_psql
py_psql.py
__init__.py
setup.py
My __init__.py is empty, since I read elsewhere that this is allowed. I'm not remotely an expert here...
I was able to pip install that package and import it. If I paste this class into a Python shell, I can do something like:
test = py_psql()
test.engine(ntid, pw, hostname, port, database)
and have it create the sqlalchemy engine. However, when I import it after the pip install I can't even initialize a py_psql object:
>>> test = py_psql()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'module' object is not callable
>>> py_psql.engine(ntid, pw, hostname, port, database)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: module 'py_psql' has no attribute 'engine'
I'm sure I'm messing up something obvious here, but I found the process of packaging fairly confusing while researching this. What am I doing incorrectly?
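Are you sure you imported your package correctly after pip install? The TypeError says that the name py_psql refers to a module, not to your class. A plain import py_psql gives you the package, and with an empty __init__.py the class is only reachable through the py_psql.py submodule inside it.
For example: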
from py_psql.py_psql import py_psql
test = py_psql()
test.engine(ntid, pw, hostname, port, database)
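If you would rather not write the doubled path, one common option is to re-export the class from __init__.py. The sketch below is only an illustration, not your actual package: it builds two throwaway packages in a temp directory, using the hypothetical names demo_empty and demo_reexport, each with a submodule and class of the same name (mirroring your py_psql/py_psql.py layout). One has an empty __init__.py, the other has a one-line re-export.

```python
import importlib
import sys
import tempfile
import types
from pathlib import Path


def make_package(root, name, init_source):
    """Create root/name/ containing name.py (defines class `name`) and __init__.py."""
    pkg_dir = Path(root) / name
    pkg_dir.mkdir()
    (pkg_dir / (name + ".py")).write_text("class {}:\n    pass\n".format(name))
    (pkg_dir / "__init__.py").write_text(init_source)


# Hypothetical package names, standing in for py_psql.
root = tempfile.mkdtemp()
sys.path.insert(0, root)
make_package(root, "demo_empty", "")
make_package(root, "demo_reexport", "from .demo_reexport import demo_reexport\n")
importlib.invalidate_caches()

# Empty __init__.py: importing the package gives a module, which is not callable.
demo_empty = importlib.import_module("demo_empty")
empty_is_module = isinstance(demo_empty, types.ModuleType)
empty_is_callable = callable(demo_empty)

# The class has to be fetched from the submodule instead.
cls_from_submodule = importlib.import_module("demo_empty.demo_empty").demo_empty
obj = cls_from_submodule()

# With the re-export, the class is available directly on the package.
demo_reexport = importlib.import_module("demo_reexport")
reexport_obj = demo_reexport.demo_reexport()
```

Applied to your package, that would mean putting from .py_psql import py_psql in __init__.py, after which from py_psql import py_psql should work once the package is reinstalled. Giving the class a different name from the module (for example a CamelCase class name) also makes errors like this one easier to read.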