Search code examples
pythonmysqlpandasxlsx

How to export a query to excel using Python?


I have been trying to loop through a list as a parameter for a query from database, and convert it into xlsx format, using pyodbc, pandas, xlsxwriter modules.

However, the message below keeps on appearing despite a process of trial and error: The first argument to execute must be a string or unicode query.

Could this have something to do with the query itself or the module 'pandas'?

Thank you.

This is for exporting a query result to an excel spreadsheet using pandas and pyodbc, with python 3.7 ver.

import pyodbc
import pandas as pd 

#Database Connection
conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
                      user='xxxxxx', password='xxxxxxxx')

cursor = conn.cursor()

depts = ['Human Resources','Accounting','Marketing']

query = """
                           SELECT *
                           FROM device ID 
                           WHERE
                           Department like ?
                           AND
                           Status like 'Active'
                           """

target = r'O:\\Example'

today = target + os.sep + time.strftime('%Y%m%d')

if not os.path.exists(today):
    os.mkdir(today)

for i in departments:
    cursor.execute(query, i)
    #workbook = Workbook(today + os.sep + i + 'xlsx')
    #worksheet = workbook.add_worksheet()
    data = cursor.fetchall()
    P_data = pd.read_sql(data, conn)
    P_data.to_excel(today + os.sep + i + 'xlsx')

Solution

  • When you read data into a dataframe using pandas.read_sql(), pandas expects the first argument to be a query to execute (in string format), not the results from the query.

    Instead of your line:

    P_data = pd.read_sql(data, conn)
    

    You'd want to use:

    P_data = pd.read_sql(query, conn)
    

    And to filter out the departments, you'd want to serialize the list into SQL syntax string:

    depts = ['Human Resources','Accounting','Marketing']
    
    # gives you the string to use in your sql query:
    depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
    

    To use the new SQL string in your query, use str.format:

    query = """
                               SELECT *
                               FROM device ID 
                               WHERE
                               Department in {query_vals}
                               AND
                               Status like 'Active'
                               """.format(query_vals=depts_query_string)
    

    All together now:

    import pyodbc
    import pandas as pd 
    
    #Database Connection
    conn = pyodbc.connect(driver='xxxxxx', server='xxxxxxx', database='xxxxxx',
                          user='xxxxxx', password='xxxxxxxx')
    
    cursor = conn.cursor()
    
    depts = ['Human Resources','Accounting','Marketing']
    
    # gives you the string to use in your sql query:
    depts_query_string = "('{query_vals}')".format(query_vals="','".join(depts))
    
    query = """
                                SELECT *
                                FROM device ID 
                                WHERE
                                Department in {query_vals}
                                AND
                                Status like 'Active'
                                """.format(query_vals=depts_query_string)
    
    target = r'O:\\Example'
    
    today = target + os.sep + time.strftime('%Y%m%d')
    
    if not os.path.exists(today):
        os.mkdir(today)
    
    for i in departments:
        #workbook = Workbook(today + os.sep + i + 'xlsx')
        #worksheet = workbook.add_worksheet()
        P_data = pd.read_sql(query, conn)
        P_data.to_excel(today + os.sep + i + 'xlsx')