Search code examples
sql-serverrdataframerodbcbigdata

What is the fastest way to load huge .Rdata files (R dataframes) into SQL Server?


I am currently using library(RODBC) via:

dbhandle <- odbcDriverConnect('driver={SQL Server};server=xxxxxx;database=yyyy;trusted_connection=true')
sqlSave(dbhandle, R_table ,tablename = "SQL_table" )

The problem with this one is, it takes about 5 hours to load 10 Million rows into SQL Server.

Also it breaks while loading sometimes. So, I break the .Rdata into smaller chunks and sequentially load it into SQL Server.

Is there any quicker way to achieve this ?


Solution

  • I would export my data in CSV format and use BULK INSERT. If you really want (or have) to use sqlSave, then, the only options that come to my mind are:

    1. be sure to set fast=TRUE in sqlSave
    2. set auto commit off before running sqlSave: odbcSetAutoCommit(dbhandle, autoCommit = FALSE) and commit after sqlSave odbcEndTran(dbhandle, commit = TRUE)

    But, in my opinion, the real solution is BULK INSERT.