Search code examples
pythonsqlsnowflake-cloud-data-platformsnowflake-connector

Pythonic way to write OR SQL condition to extract information schema data from Snowflake


I am connecting to Snowflake using python (Below code). As you can see I am trying to get Row Count,Date table was created and altered from Information Schema that is present in SNOWFLAKE_SAMPLE_DATA database for schema 'TPCDS_SF100TCL'. My question is instead of writing "OR" condition for each and every table (in below example: CUSTOMER,CALL_CENTER,CUSTOMER_ADDRESS), I was wondering if I can just provide a list somewhere and iterate through that list for "OR" condition. I am not sure if this is possible ? If it is please suggest the code with explanation.

Thanks in advance for your time and suggestion!

Python Code

    import pandas as pd
    import snowflake.connector
    
    conn = snowflake.connector.connect(
                user="MY_USER",
                password="MY_PSWD",
                account="MY_ACCOUNT",
                warehouse="COMPUTE_WH",
                database="SNOWFLAKE_SAMPLE_DATA",
                schema="INFORMATION_SCHEMA",
                role="SYSADMIN"
                )
    
    cur = conn.cursor()
    
    try:
        cur.execute("SELECT TABLE_NAME,ROW_COUNT,CREATED,LAST_ALTERED  FROM TABLES WHERE TABLE_TYPE='BASE TABLE' 
AND TABLE_SCHEMA='TPCDS_SF100TCL' AND TABLE_NAME='CUSTOMER' OR TABLE_NAME='CALL_CENTER' 
OR TABLE_NAME='CUSTOMER_ADDRESS'")
        df = cur.fetch_pandas_all()
    finally:
        cur.close()
    conn.close()

Code after suggestion by BeRT2me [UPDATE: Both approaches provided by SO user works. See below his suggested code]

cur = conn.cursor()


tables = ['CUSTOMER', 'CALL_CENTER', 'CUSTOMER_ADDRESS']
try:

    cur.execute(f"""SELECT TABLE_NAME, ROW_COUNT, CREATED, LAST_ALTERED FROM TABLES WHERE TABLE_TYPE='BASE TABLE' AND TABLE_SCHEMA='TPCDS_SF100TCL' AND TABLE_NAME IN ({','.join(tables)})""")
    df = cur.fetch_pandas_all()
finally:
    cur.close()
conn.close()

Solution

  • If just simplifying the query is good enough, you can start with:

    cur.execute(
        """
        SELECT TABLE_NAME, ROW_COUNT, CREATED, LAST_ALTERED 
        FROM TABLES 
        WHERE TABLE_TYPE='BASE TABLE' 
        AND TABLE_SCHEMA='TPCDS_SF100TCL' 
        AND TABLE_NAME IN ('CUSTOMER', 'CALL_CENTER', 'CUSTOMER_ADDRESS')
        """
    )
    

    If you want a pythonic way of injecting a list, this should work:

    tables = ['CUSTOMER', 'CALL_CENTER', 'CUSTOMER_ADDRESS']
    
    cur.execute(
        f"""
        SELECT TABLE_NAME, ROW_COUNT, CREATED, LAST_ALTERED 
        FROM TABLES 
        WHERE TABLE_TYPE='BASE TABLE' 
        AND TABLE_SCHEMA='TPCDS_SF100TCL' 
        AND TABLE_NAME IN ({','.join("'" + x + "'" for x in tables)})
        """
    )