I have been trying to loop through a list as a parameter for a query from database, and convert it into xlsx format, using pyodbc, pandas, xlsxwriter modules.
However, the message below keeps on appearing despite a process of trial and error: The first argument to execute must be a string or unicode query.
Could this have something to do with the query itself or the module 'pandas'?
Thank you.
This is for exporting a query result to an excel spreadsheet using pandas and pyodbc, with python 3.7 ver.
import pyodbc
import pandas as pd
#Database Connection
conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
user='xxxxxx', password='xxxxxxxx')
cursor = conn.cursor()
depts = ['Human Resources','Accounting','Marketing']
query = """
SELECT *
FROM device ID
WHERE
Department like ?
AND
Status like 'Active'
"""
target = r'O:\\Example'
today = target + os.sep + time.strftime('%Y%m%d')
if not os.path.exists(today):
os.mkdir(today)
for i in departments:
cursor.execute(query, i)
#workbook = Workbook(today + os.sep + i + 'xlsx')
#worksheet = workbook.add_worksheet()
data = cursor.fetchall()
P_data = pd.read_sql(data, conn)
P_data.to_excel(today + os.sep + i + 'xlsx')
When you read data into a dataframe using pandas.read_sql(), pandas expects the first argument to be a query to execute (in string format), not the results from the query.
Instead of your line:
P_data = pd.read_sql(data, conn)
You'd want to use:
P_data = pd.read_sql(query, conn)
And to filter out the departments, you'd want to serialize the list into SQL syntax string:
depts = ['Human Resources','Accounting','Marketing']
# gives you the string to use in your sql query:
depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
To use the new SQL string in your query, use str.format:
query = """
SELECT *
FROM device ID
WHERE
Department in {query_vals}
AND
Status like 'Active'
""".format(query_vals=depts_query_string)
All together now:
import pyodbc
import pandas as pd
#Database Connection
conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
user='xxxxxx', password='xxxxxxxx')
cursor = conn.cursor()
depts = ['Human Resources','Accounting','Marketing']
# gives you the string to use in your sql query:
depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
query = """
SELECT *
FROM device ID
WHERE
Department in {query_vals}
AND
Status like 'Active'
""".format(query_vals=depts_query_string)
target = r'O:\\Example'
today = target + os.sep + time.strftime('%Y%m%d')
if not os.path.exists(today):
os.mkdir(today)
for i in departments:
#workbook = Workbook(today + os.sep + i + 'xlsx')
#worksheet = workbook.add_worksheet()
P_data = pd.read_sql(query, conn)
P_data.to_excel(today + os.sep + i + 'xlsx')