Search code examples
pythonpandassqlalchemypymysql

How to send Excel data to MySQL using pandas and PyMySQL?


I'm having issues importing data with python into a table on my Database directly from an excel file.

I have this code:

import os
import pandas as pd
import pymysql

if os.path.exists("env.py"):
    import env
print(os.environ)


# Abre conexion con la base de datos
db = pymysql.connect(
    host = os.environ.get("MY_DATABASE_HOST"),
    user = os.environ.get("MY_USERNAME"),
    password = os.environ.get("MY_PASSWORD"),
    database = os.environ.get("MY_DATABASE_NAME")
)
##################################################

################LECTURA DE EXCEL##################
tabla_azul = "./excelFiles/tablaAzul.xlsx"
dAzul = pd.read_excel(tabla_azul, sheet_name='Órdenes')
dAzul.to_sql(con=db, name='tablaazul', if_exists='append', schema='str')
#print(type(dAzul))

tabla_verde = "./excelFiles/tablaVerde.xlsx"
dVerde = pd.read_excel(tabla_verde, sheet_name='Órdenes')
dVerde.to_sql(con=db, name='tablaverde', if_exists='append', schema='str')
  1. I'm not sure what table name I have to put into the name variable.
  2. Do I need to use sqlalchemy yes or yes?
  3. If question 2 is yes: Is it possible to connect sqlalchemy with pymysql?
  4. If question 3 is no: Ho do I use the .env variables like host with sqlalchemy connection?

thank you!

when I run the code above, it's giving me this error:

pandas.io.sql.DatabaseError: Execution failed on sql 'SELECT name FROM sqlite_master WHERE type='table' AND name=?;': not all arguments converted during string formatting

Solution

  • As stated in the pandas documentation, for any database other than SQLite .to_sql() requires a SQLAlchemy Connectable object, which is either an Engine object or a Connection object. You can create an Engine object for PyMySQL like so:

    import sqlalchemy as sa
    
    connection_url = sa.engine.URL.create(
        "mysql+pymysql",
        username=os.environ.get("MY_USERNAME"),
        password=os.environ.get("MY_PASSWORD"),
        host=os.environ.get("MY_DATABASE_HOST"),
        database=os.environ.get("MY_DATABASE_NAME")
    )
    engine = sa.create_engine(connection_url)
    

    Then you can call .to_sql() and pass it the engine:

    dVerde.to_sql(con=engine, name='tablaverde', if_exists='append', schema='str')