I have run into a peculiar issue while using the teradatasql package (installed from pypi). I use the following code (let's call it pytera.py) to query a database:
from dotenv import load_dotenv
import pandas as pd
import teradatasql
# Load the database credentials from .env file
_ = load_dotenv()
db_host = os.getenv('db_host')
db_username = os.getenv('db_username')
db_password = os.getenv('db_password')
def run_query(query):
"""Run query string on teradata and return DataFrame."""
if query.strip()[-1] != ';':
query += ';'
with teradatasql.connect(host=db_host, user=db_username,
password=db_password) as connect:
df = pd.read_sql(query, connect)
return df
When I import this function in the IPython/Python interpreter or in Jupyter Notebook, I can run queries just fine like so:
import pytera as pt
pt.run_query('select top 5 * from table_name;')
However, if I save the above code in a .py file and try to run it, I get an error message most of the time (not all the time). The error message is below.
E teradatasql.OperationalError: [Version 16.20.0.49] [Session 0] [Teradata SQL Driver] Hostname lookup failed for None
E at gosqldriver/teradatasql.(*teradataConnection).makeDriverError TeradataConnection.go:1046
E at gosqldriver/teradatasql.(*Lookup).getAddresses CopDiscovery.go:65
E at gosqldriver/teradatasql.discoverCops CopDiscovery.go:137
E at gosqldriver/teradatasql.newTeradataConnection TeradataConnection.go:133
E at gosqldriver/teradatasql.(*teradataDriver).Open TeradataDriver.go:32
E at database/sql.dsnConnector.Connect sql.go:600
E at database/sql.(*DB).conn sql.go:1103
E at database/sql.(*DB).Conn sql.go:1619
E at main.goCreateConnection goside.go:229
E at main._cgoexpwrap_e6e101e164fa_goCreateConnection _cgo_gotypes.go:214
E at runtime.call64 asm_amd64.s:574
E at runtime.cgocallbackg1 cgocall.go:316
E at runtime.cgocallbackg cgocall.go:194
E at runtime.cgocallback_gofunc asm_amd64.s:826
E at runtime.goexit asm_amd64.s:2361
E Caused by lookup None on <ip address redacted>: server misbehaving
I am using Python 3.7.3 and teradatasql 16.20.0.49 on Ubuntu (WSL) 18.04.
Perhaps not coincidentally, I run into a similar issue when trying a similar workflow on Windows (using the teradata package and the Teradata Python drivers installed). Works when I connect inside the interpreter or in Jupyter, but not in a script. In the Windows case, the error is:
E teradata.api.DatabaseError: (10380, '[08001] [Teradata][ODBC] (10380) Unable to establish connection with data source. Missing settings: {[DBCName]}')
I have a feeling that there's something basic that I'm missing, but I can't find a solution to this anywhere.
Thanks ravioli for the fresh eyes. Turns out the issue was loading in the environment variables using dotenv. My module is in a Python package (separate folder), and my script and .env files are in the working directory.
dotenv successfully reads in the environment variables (.env in my working directory) when I run the code in my original post, line by line, in the interpreter or in Jupyter. However, when I run the same code in a script, it does not find in the .env file in my working directory. That will be a separate question I'll have to find the answer to.
import teradatasql
import pandas as pd
def run_query(query, db_host, db_username, db_password):
"""Run query string on teradata and return DataFrame."""
if query.strip()[-1] != ';':
query += ';'
with teradatasql.connect(host=db_host, user=db_username,
password=db_password) as connect:
df = pd.read_sql(query, connect)
return df
The code below runs fine in a script now:
import pytera as pt
from dotenv import load_dotenv()
_ = load_dotenv()
db_host = os.getenv('db_host')
db_username = os.getenv('db_username')
db_password = os.getenv('db_password')
data = pt.run_query('select top 5 * from table_name;', db_host, db_username, db_password)