Search code examples
pythonsqlteradatadotenv

teradatasql Python module only works when scripting but not when running code


I have run into a peculiar issue while using the teradatasql package (installed from pypi). I use the following code (let's call it pytera.py) to query a database:

from dotenv import load_dotenv
import pandas as pd
import teradatasql

# Load the database credentials from .env file
_ = load_dotenv()
db_host = os.getenv('db_host')
db_username = os.getenv('db_username')
db_password = os.getenv('db_password')


def run_query(query):
    """Run query string on teradata and return DataFrame."""
    if query.strip()[-1] != ';':
        query += ';'

    with teradatasql.connect(host=db_host, user=db_username,
                         password=db_password) as connect:
        df = pd.read_sql(query, connect)
    return df

When I import this function in the IPython/Python interpreter or in Jupyter Notebook, I can run queries just fine like so:

import pytera as pt

pt.run_query('select top 5 * from table_name;')

However, if I save the above code in a .py file and try to run it, I get an error message most of the time (not all the time). The error message is below.

E   teradatasql.OperationalError: [Version 16.20.0.49] [Session 0] [Teradata SQL Driver] Hostname lookup failed for None
E    at gosqldriver/teradatasql.(*teradataConnection).makeDriverError TeradataConnection.go:1046
E    at gosqldriver/teradatasql.(*Lookup).getAddresses CopDiscovery.go:65
E    at gosqldriver/teradatasql.discoverCops CopDiscovery.go:137
E    at gosqldriver/teradatasql.newTeradataConnection TeradataConnection.go:133
E    at gosqldriver/teradatasql.(*teradataDriver).Open TeradataDriver.go:32
E    at database/sql.dsnConnector.Connect sql.go:600
E    at database/sql.(*DB).conn sql.go:1103
E    at database/sql.(*DB).Conn sql.go:1619
E    at main.goCreateConnection goside.go:229
E    at main._cgoexpwrap_e6e101e164fa_goCreateConnection _cgo_gotypes.go:214
E    at runtime.call64 asm_amd64.s:574
E    at runtime.cgocallbackg1 cgocall.go:316
E    at runtime.cgocallbackg cgocall.go:194
E    at runtime.cgocallback_gofunc asm_amd64.s:826
E    at runtime.goexit asm_amd64.s:2361
E   Caused by lookup None on <ip address redacted>: server misbehaving

I am using Python 3.7.3 and teradatasql 16.20.0.49 on Ubuntu (WSL) 18.04.

Perhaps not coincidentally, I run into a similar issue when trying a similar workflow on Windows (using the teradata package and the Teradata Python drivers installed). Works when I connect inside the interpreter or in Jupyter, but not in a script. In the Windows case, the error is:

E teradata.api.DatabaseError: (10380, '[08001] [Teradata][ODBC] (10380) Unable to establish connection with data source. Missing settings: {[DBCName]}')

I have a feeling that there's something basic that I'm missing, but I can't find a solution to this anywhere.


Solution

  • Thanks ravioli for the fresh eyes. Turns out the issue was loading in the environment variables using dotenv. My module is in a Python package (separate folder), and my script and .env files are in the working directory.

    dotenv successfully reads in the environment variables (.env in my working directory) when I run the code in my original post, line by line, in the interpreter or in Jupyter. However, when I run the same code in a script, it does not find in the .env file in my working directory. That will be a separate question I'll have to find the answer to.

    import teradatasql
    import pandas as pd
    
    
    def run_query(query, db_host, db_username, db_password):
        """Run query string on teradata and return DataFrame."""
        if query.strip()[-1] != ';':
            query += ';'
    
        with teradatasql.connect(host=db_host, user=db_username,
                             password=db_password) as connect:
            df = pd.read_sql(query, connect)
        return df
    
    

    The code below runs fine in a script now:

    import pytera as pt
    from dotenv import load_dotenv()
    
    _ = load_dotenv()
    db_host = os.getenv('db_host')
    db_username = os.getenv('db_username')
    db_password = os.getenv('db_password')
    
    data = pt.run_query('select top 5 * from table_name;', db_host, db_username, db_password)