Search code examples
pythonsqlpyodbc

Does connecting to a different SQL database, require a new pyodbc connection?


I am trying to write a function that will search for a value in an SQL table and return the table name if the value is found. Additionally, I have it setup that if the user leaves the table name blank, it will search in all tables associated with the specified database. However, I would also like to have it setup where if the database name is left blank, it will also iterate through each database on the server, and then through each table of the respective databases.

Problem: Ideally, rather than creating a new connection to the server and database, I was hoping there would be a way to use the existing connection to the server.

import pandas
import pyodbc

server_name = 'SomeServer'
db_name = 'master' 
col_value = 'FindThisValue'

with pyodbc.connect("DRIVER={SQL Server};" +
                      f"SERVER={server_name};" +
                      f"DATABASE={db_name};" +
                      "Trusted_Connection=yes;") as main_conn:
        print('Connection established!')

dbs = pandas.read_sql('SELECT name FROM sys.databases', con=main_conn).name.to_list()

Without having to then do something like this, where it could potentially be hundreds of new connections:

for db in dbs:
    with pyodbc.connect("DRIVER={SQL Server};" +
                          f"SERVER={server_name};" +
                          f"DATABASE={db};" +
                          "Trusted_Connection=yes;") as main_conn:
    print('Connection established!')
    db_tables = [_.table_name for _ in main_conn.cursor().tables()]
    
    for table in db_tables:
        cols_query = f"SELECT name FROM sys.columns WHERE object_id = OBJECT_ID('{table}')"
        cols = pandas.read_sql_query(cols_query, con=main_conn).name.to_list()

        for col in cols:
            query = f"SELECT {col} FROM {table} WHERE {col} = '{col_value}'"
            # Technically there would be more logic here since 
            # this query could fail for a number of reasons, so
            # it's simplified to just this single query
            response = pandas.read_sql_query(query, con=main_conn)

I'd rather use the initial connection to the master and get the database and its corresponding columns.


Solution

  • Here's an option that doesn't require you to keep creating new connections, broken up into a few pieces:

    import pandas
    import pyodbc
    
    def establish_conn(server_name, db_name=None):
        print('Establishing connection...')
        if db_name is None:
            db_name = 'master'
            
        with pyodbc.connect("DRIVER={SQL Server};" +
                          f"SERVER={server_name};" +
                          f"DATABASE={db_name};" +
                          "Trusted_Connection=yes;") as main_conn:
            print('Connection established!')
            return main_conn
    
    def get_dbs(con):
        # With just the server name, this will retrieve all databases on a server
        dbs = pandas.read_sql('SELECT name FROM sys.databases', con=con).name.to_list()
        return dbs
    
    # create a connection
    conn = establish_conn('ServerName')
    
    # get all dbs on server
    dbs = get_dbs(conn)
    
    # Now get all tables from a database
    for db in dbs:
        tableData = pandas.read_sql(f"SELECT TABLE_NAME FROM {db}.INFORMATION_SCHEMA.tables", con=conn)