The dataframe is huge (7-8 million rows). I tried to_sql with chunksize=5000, but it never finished.
I am using:
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL

# engine = create_engine(URL(account=..., user=..., ...))  # connection details not shown here
df.to_sql(snowflake_table, engine, if_exists='replace', index=False, chunksize=20000)
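A variant I have seen suggested (but have not benchmarked myself) keeps the same to_sql call and passes the connector's pd_writer as the insertion method, so each chunk is staged and bulk-loaded with COPY INTO instead of being inserted with batched INSERT statements; this assumes snowflake-connector-python is installed with its pandas extras:

from snowflake.connector.pandas_tools import pd_writer

# same call as above, but pd_writer bulk-loads each chunk through an internal stage
df.to_sql(snowflake_table, engine, if_exists='replace', index=False, method=pd_writer)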
What other, faster options are there for loading data into Snowflake from a pandas DataFrame? Or am I doing something wrong here? The DataFrame is usually 7-10 million rows.
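For reference, one option I have come across is the connector's own bulk loader, write_pandas, which writes the frame out in chunks, PUTs them to a temporary internal stage, and runs COPY INTO for you. A minimal sketch with placeholder connection details (the target table generally needs to exist already):

import snowflake.connector
from snowflake.connector.pandas_tools import write_pandas

# placeholder connection details
conn = snowflake.connector.connect(
    account='<account>', user='<user>', password='<password>',
    warehouse='<warehouse>', database='<database>', schema='<schema>',
)

# stages the frame and bulk-loads it with COPY INTO under the hood
success, nchunks, nrows, _ = write_pandas(conn, df, 'MY_TABLE')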
The optimal way, as ilja-everila pointed out, is “COPY INTO ...”. Since Snowflake requires the CSV to be staged in cloud storage before the load, I was hesitant to go that route, but it seems to be the only viable option given that it loads 6.5 million records in about 5-10 minutes.
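For completeness, a rough sketch of that staged-file route using the table's internal stage, so no external cloud bucket has to be managed; the local path, table name, and connection details below are placeholders:

import snowflake.connector

# placeholder connection details
conn = snowflake.connector.connect(
    account='<account>', user='<user>', password='<password>',
    warehouse='<warehouse>', database='<database>', schema='<schema>',
)
cur = conn.cursor()

# dump the frame to a compressed CSV on local disk
df.to_csv('/tmp/my_table.csv.gz', index=False, header=False, compression='gzip')

# upload the file to the table's internal stage, then bulk-load it
cur.execute("PUT file:///tmp/my_table.csv.gz @%MY_TABLE")
cur.execute("COPY INTO MY_TABLE FROM @%MY_TABLE "
            "FILE_FORMAT = (TYPE = CSV FIELD_OPTIONALLY_ENCLOSED_BY = '\"')")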