Search code examples
mysqlrrodbcrmysql

Is there a faster way to upload data from R to MySql?


I am using the following code to upload a new table into a mysql database.

library(RMySql)
library(RODBC)

con <- dbConnect(MySQL(),
  user = 'user',
  password = 'pw',
  host = 'amazonaws.com',
  dbname = 'db_name')

dbSendQuery(con, "CREATE TABLE table_1 (
        var_1 VARCHAR(50),
        var_2 VARCHAR(50),
        var_3 DOUBLE,
        var_4 DOUBLE);
        ")

channel <- odbcConnect("db name")
sqlSave(channel, dat = df, tablename = "tb_name", rownames = FALSE, append = 
TRUE)

The full data set is 68 variables and 5 million rows. It is taking over 90 minutes to upload 50 thousand rows to MySql. Is there a more efficient way to upload the data to MySql. I originally tried dbWriteTable() but this would result in an error message saying the connection to the database was lost.


Solution

  • Consider a CSV export from R for an import into MySQL with LOAD DATA INFILE:

    ...
    write.csv(df, "/path/to/filename.csv", row.names=FALSE)
    
    dbSendQuery(con, "LOAD DATA LOCAL INFILE '/path/to/filename.csv'
                      INTO TABLE mytable
                      FIELDS TERMINATED by ','
                      ENCLOSED BY '"'
                      LINES TERMINATED BY '\\n'")