Search code examples
python-3.xpandasamazon-web-servicessqlalchemyamazon-redshift

Python Sqlalchemy insert data into AWS Redshift


I want to load a large excel table data into AWS Redshift, using Python psycopg2 take a long time to load, so I try to use Sqlalchemy. but the redshift-sqlalchemy documentation is confusing. So I want to use the regular Sqlalchemy library. Below code can pull data from AWS redshift, I don't know how to modify it to INSERT data into redshift. if possible, I like to INSERT data at once.

import pandas as pd
import psycopg2
from sqlalchemy import create_engine
from sqlalchemy import text

sql = """
SELECT top 10 * FROM pg_user;
""" 

redshift_endpoint1 = "YourDBname.cksrxes2iuiu.us-east-1.redshift.amazonaws.com"
redshift_user1 = "YourUserName"
redshift_pass1 = "YourRedshiftPassword"
port1 = 8192 #whaterver your Redshift portnumber is
dbname1 = "YourDBname"

from sqlalchemy import create_engine
from sqlalchemy import text
engine_string = "postgresql+psycopg2://%s:%s@%s:%d/%s" \
% (redshift_user1, redshift_pass1, redshift_endpoint1, port1, dbname1)
engine1 = create_engine(engine_string)

df1 = pd.read_sql_query(text(sql), engine1)

Solution

  • df = pd.DataFrame({ 'id':['444'],'id2':[555]})
    df.to_sql('YourTable', con=engine1,if_exists='append',index= False)