I have a project I received at work trying to supersede part numbers from a bill of material in an excel file. I have a query that I wrote to run the parts through a self-join that would essentially supersede the parts through. I can manually manipulate the excel file and make the query return what I need through snowflake. We'll likely get more requests of this type if we can successfully turn this into an automation. It would save SO much time in manual labor. Once I can figure this snowflake connector issue, I want to load the result into a new column in the excel spreadsheet.
I have successfully extracted the column with the list of part numbers from the excel file, authenticated to snowflake using sqlalchemy URL. Now I'm trying to apply my SQL query and I'm running into a roadblock. I've gone through the SQLAlchemy and Snowflake Python Connector documentation and I just can't seem to quite get it right.
Even after adding:
conn.cursor().execute("USE WAREHOUSE XS_WAREHOUSE")
conn.cursor().execute("USE DATABASE DB")
conn.cursor().execute("USE SCHEMA db_mg.schemado you")"""
Here is the current code I am working with:
import pandas as pd
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine
"""Extract the list of part numbers from the Part Number column and input it into a pandas
dataframe"""
df = pd.read_excel(r'C:/Users/[MyUsername]/Downloads/EAS Working Candidates.xlsx')
part_num_col = df['PartNumber'].tolist()
"""SQL Query for Parts List built from .XLSX Part File.
Part supersession is done through AS400 manually. The Query loops the part numbers back
through the query until a null is returned. Once a null is returned, the part number from the
previous loop is then placed into a column 'final_actual'."""
query = r"SELECT r.itmid, rg.cstsku as final_part, i.item_desc as final_part_desc,
CASE WHEN rg.cstsku is not null THEN rg.cstsku
WHEN rf.cstsku is not null THEN rf.cstsku
WHEN re.cstsku is not null THEN re.cstsku
WHEN rd.cstsku is not null THEN rd.cstsku
WHEN rc.cstsku is not null THEN rc.cstsku
WHEN rb.cstsku is not null THEN rb.cstsku
WHEN ra.cstsku is not null THEN ra.cstsku
WHEN r.cstsku is not null THEN r.cstsku
ELSE r.itmid END as final_actual
FROM DB.AS400.VC_DOPCIXREF r
LEFT JOIN DB.AS400.VC_DOPCIXREF ra
ON r.cstsku = ra.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF rb
ON ra.cstsku = rb.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF rc
ON rb.cstsku = rc.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF rd
ON rc.cstsku = rd.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF re
ON rd.cstsku = re.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF rf
ON re.cstsku = rf.itmid
LEFT JOIN DB.AS400.VC_DOPCIXREF rg
ON rf.cstsku = rg.itmid
LEFT JOIN DB.AS400.VC_ITEM i
ON rg.cstsku = i.item_id WHERE r.cstsku IN (" + "'" + "','".join(map(str, part_num_col)) + "'" + ");"
#Connect to the Snowflake Data Warehouse
engine = create_engine(URL(
user='[MyUsername]@domain.com',
account='account',
role='ANALYST',
authenticator='EXTERNALBROWSER',
warehouse='DB',
database='AS400',
))
connection = engine.connect()
cur = connection.cursor()
try:
cur.execute(query)
df = pd.read_sql(query, engine)
finally:
connection.close()
engine.dispose()
I receive the following error:
Traceback (most recent call last):
File "C:\Users\[MyUsername]\PycharmProjects\PTC EAS Report\PTC EAS Part List.py", line 24, in <br /> <module>
cur = Connection.Cursor()
AttributeError: 'Connection' object has no attribute 'Cursor'
Thanks in Advance for taking a look! (Also, if I can present this in a better, more readable way next time please let me know! I'm still learning!)
There is no cursor attribute associated with the connection object in the sqlalchemy snowflake connector.
The correct usage is as below
from snowflake.sqlalchemy import URL
from sqlalchemy import create_engine
engine = create_engine(URL(
account = 'myorganization-myaccount',
user = 'testuser1',
password = '0123456',
database = 'testdb',
schema = 'public',
warehouse = 'testwh',
role='myrole',
))
try:
connection = engine.connect()
results = connection.execute('select
current_version()').fetchone()
print(results[0])
or
rows = r.fetchall()
results = connection.execute('select
current_version()').fetchall()
print(rows)
finally:
connection.close()
engine.dispose()
Regards, Sujan