Search code examples
mysqlrdatasetadditiondeducer

How to add a dataframe to a mysql table using RODBC


An absolute beginner at R so please excuse any obvious questions.

I am trying to update a MySQL table ('exdata3' in database' test') with the dataframe from Deducer.

I had initially called the database and used the table in deducer, run some queries on it which is stored in the dataframe 'data1'. Now I want to send 'data1' which contains a new column and update the non-empty table in MySQL, exdata3.

(In future I might not have new columns/rows but the existing columns might have different values in the dataframe as compared to the same columns in MySQL, if you know what I mean )

The difference between the dataframe and the table in MySQL is the addition of a column 'DiffHighLow' (which contains the values calculated from two other coulmns in the table) in the dataframe.

I am using RODBC to communicate between R and MySQL on 64 bit Windows7

I have tried the following with the subsequent errors.

1. > sqlSave(channel, data1, exdata3, append= TRUE,rownames = "DiffHighLow", colnames = TRUE, verbose = FALSE, safer = TRUE, addPK = FALSE, "integer", fast = TRUE, test = FALSE, nastring = NULL)

Error in sqlSave(channel, data1, exdata3, append = TRUE, rownames = "DiffHighLow", : object 'exdata3' not found

2.> sqlSave(channel, data1, rownames = "DiffHighLow", addPK = TRUE)

Error in sqlSave(channel, data1, rownames = "DiffHighLow", addPK = TRUE) : 42S21 1060 [MySQL][ODBC 5.3(w) Driver][mysqld-5.0.41-community-nt]Duplicate column name 'diffhighlow'

[RODBC] ERROR: Could not SQLExecDirect 'CREATE TABLE data1 (DiffHighLow varchar(255) NOT NULL PRIMARY KEY, exdata3_id integer, date double, open double, high double, low double, close double, shares_ integer, turnover_ double, diffhighlow double)'

3rd. > sqlUpdate(channel,data1, "exdata3")

Error in sqlUpdate(channel, data1, "exdata3") : data frame column(s) date open high low close shares_ turnover_ diffhighlow not in database table

I don't understand, amongst other things, why its not recognizing the row names which already exist in the database table except for 'DiffHighLow'

Please advise. I have searched for such a question on SO a lot and tried the the various solutions already posted (that looked most alike to my query)but none of them seem to be working.


Solution

  • If I understand you correctly, you load the contents of a table, exdata3 into R from a MySQL database, modify it by among other things adding a new column, DiffHighLow, and now you want to save it back to the MySQL database. If that is what you are trying to do, this will work:

    # this just sets up the example
    set.seed(1)    # for reproducible example
    df <- data.frame(Open=rnorm(50), Low=rnorm(50), High=rnorm(50), Close=rnorm(50))
    library(RODBC)
    channel <- odbcConnect(dsn="MySQL", case="nochange")
    sqlSave(channel,df,"exdata3", rownames=F)
    
    data1 <- sqlQuery(channel,"select * from exdata3")
    data1 <- transform(data1,DiffHighLow=High-Low)
    
    # you start here
    sqlQuery(channel,"drop table exdata3")
    sqlSave(channel,data1,"exdata3")
    result <- sqlQuery(channel,"select * from exdata3")
    head(result)
    

    The basic problem is that sqlSave(...) requires either that the MySQL table not exist at all (in which case it will be created), or that it has columns which map exactly to the columns in your database. You can't add columns using sqlSave(...). So this code simply deletes the original version of exdata3 and then creates a new copy from the data in data1. The last line just reads the table back in to demonstrate that it worked.

    There are a couple of problems with your code.

    The first try fails because you pass a variable name exdata3, instead of a character string "exdata3". You don't have a variable named exdata3 in your R environment, hence the error: "object 'exdata3' not found".

    The second try fails because you are telling the ODBC driver to save the data frame's row names in a column called diffhighlow, but there is already a column in data1 with that name, so the driver is trying to create two columns with the same name.

    The third try fails because the columns in exdata3 don't map out to the columns in data1.