Copying a data frame from RStudio to SAP HANA


I have installed the necessary packages, extracted Twitter data into a list, and converted that list into a data.frame.

I used the following code to extract data from Twitter:

library(twitteR)
setup_twitter_oauth("consumer-key", "consumer-secret",
                    "access-token", "access-secret")

search.string <- "#Karnataka"
no.of.tweets <- 10

tweets <- searchTwitter(search.string, n=no.of.tweets, lang="en")

tweet_df <- twListToDF(tweets)

I want to load the data.frame tweet_df into a table in my schema in a HANA database. Can anyone please help me with this?


Solution

  • Generally, I'd recommend keeping the data in SAP HANA and operating on it there, but you can of course also "save" R data frames to it.

    Here is a simple example:

    # load the ODBC driver
    library("RODBC")
    # open an ODBC "channel" - since I use the SAP HANA secure storage, all I have to
    # specify here is the name of the DSN entry.
    ch <- odbcConnect("S12")
    
    # this is just the string for the name of the table the data will be inserted into
    table.for.save <- 'AIRQUALITY'
    # get the data frame for the AIRQUALITY sample data 
    aqdata <- airquality
    # that's a data frame
    str(aqdata)
    
    # now "save" the data frame to SAP HANA via sqlSave
    sqlSave(ch,dat = aqdata, tablename = table.for.save, verbose = TRUE, rownames =  FALSE)
    
    
    > sqlSave(ch,dat = aqdata, tablename = table.for.save, verbose = TRUE, rownames =  FALSE)
    
    Query: CREATE TABLE "AIRQUALITY"  ("Ozone" INTEGER, "SolarR" INTEGER, "Wind" DOUBLE, "Temp" INTEGER, "Month" INTEGER, "Day" INTEGER)
    Query: INSERT INTO "AIRQUALITY" ( "Ozone", "SolarR", "Wind", "Temp", "Month", "Day" ) VALUES ( ?,?,?,?,?,? )
    Binding: 'Ozone' DataType 4, ColSize 10
    Binding: 'SolarR' DataType 4, ColSize 10
    Binding: 'Wind' DataType 8, ColSize 15
    Binding: 'Temp' DataType 4, ColSize 10
    Binding: 'Month' DataType 4, ColSize 10
    Binding: 'Day' DataType 4, ColSize 10
    Parameters:
    no: 1: Ozone 41/***/no: 2: SolarR 190/***/no: 3: Wind 7.4/***/no: 4: Temp 67/***/no: 5: Month 5/***/no: 6: Day 1/***/
    no: 1: Ozone 36/***/no: 2: SolarR 118/***/no: 3: Wind 8/***/no: 4: Temp 72/***/no: 5: Month 5/***/no: 6: Day 2/***/
    no: 1: Ozone 12/***/no: 2: SolarR 149/***/no: 3: Wind 12.6/***/no: 4: Temp 74/***/no: 5: Month 5/***/no: 6: Day 3/***/
    ...
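
    The CREATE TABLE statement in the log is derived from the data frame itself, so the column names and types can be previewed in R before anything is sent. The following is a minimal sketch: the gsub() pattern is an assumption that happens to reproduce the "Solar.R" to "SolarR" renaming visible in the log, not a documented guarantee of what sqlSave does internally.

    ```r
    # Column names as they exist in the R data frame
    r_names <- names(airquality)

    # Assumed clean-up rule: drop every character that is not alphanumeric
    # or an underscore (this reproduces "Solar.R" -> "SolarR" from the log)
    db_names <- gsub("[^[:alnum:]_]+", "", r_names)

    # Storage type of each column in R (integer, double, ...)
    r_types <- vapply(airquality, typeof, character(1))

    # Side-by-side overview to compare with the CREATE TABLE statement
    print(data.frame(r_name = r_names, db_name = db_names, r_type = unname(r_types)))
    ```

    Comparing this overview with the generated CREATE TABLE statement is a quick way to spot columns that may need a rename or a type conversion first.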
    

    In SAP HANA you can now SELECT from the table:

    select top 10 * from "DEVDUDE"."AIRQUALITY";
    
    Ozone   SolarR  Wind    Temp    Month   Day
    41      190     7.4     67      5       1  
    36      118     8.0     72      5       2  
    12      149     12.6    74      5       3  
    18      313     11.5    62      5       4  
    ?       ?       14.3    56      5       5  
    28      ?       14.9    66      5       6  
    23      299     8.6     65      5       7  
    19      99      13.8    59      5       8  
    8       19      20.1    61      5       9  
    ?       194     8.6     69      5       10 
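
    The "?" cells in this result presumably are NULLs, which correspond to the NA values in the R data frame. As a rough check, the first ten rows can be inspected locally and compared with the SELECT output. The commented sqlQuery() call assumes the channel ch from the example is still open and is not run here.

    ```r
    # First ten rows of the local data frame, for comparison with the SELECT result
    local_top10 <- head(airquality, 10)
    print(local_top10)

    # Row positions of the NA values in the two columns that contain them;
    # these should line up with the "?" cells in the database output
    na_rows <- lapply(local_top10[c("Ozone", "Solar.R")], function(x) which(is.na(x)))
    print(na_rows)

    # With an open channel, the same rows could be read back into R (not run here):
    # remote_top10 <- sqlQuery(ch, 'select top 10 * from "DEVDUDE"."AIRQUALITY"')
    ```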
    

    Since this is all standard R and ODBC and not specific to SAP HANA, I recommend reading up on those technologies.
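
    The same approach should carry over to tweet_df from the question. A data frame built from tweets typically mixes character, logical and date-time columns, so it can help to flatten those into plain types first. The sketch below uses a small made-up stand-in for tweet_df, and prepare_for_save() is a hypothetical helper, not part of RODBC. Whether these conversions are needed at all depends on the ODBC driver and on the column types you want in the target table.

    ```r
    # Made-up stand-in for tweet_df (the real one comes from twListToDF)
    tweet_df <- data.frame(
      text      = c("first #Karnataka tweet", "second #Karnataka tweet"),
      favorited = c(FALSE, TRUE),
      created   = as.POSIXct(c("2017-01-01 10:00:00", "2017-01-01 10:05:00"), tz = "UTC"),
      stringsAsFactors = FALSE
    )

    # Hypothetical helper: convert factors to character, logicals to 0/1 integers
    # and POSIXct timestamps to "YYYY-MM-DD HH:MM:SS" strings in UTC
    prepare_for_save <- function(df) {
      for (col in names(df)) {
        x <- df[[col]]
        if (is.factor(x)) {
          df[[col]] <- as.character(x)
        } else if (is.logical(x)) {
          df[[col]] <- as.integer(x)
        } else if (inherits(x, "POSIXct")) {
          df[[col]] <- format(x, "%Y-%m-%d %H:%M:%S", tz = "UTC")
        }
      }
      df
    }

    tweets_ready <- prepare_for_save(tweet_df)
    str(tweets_ready)

    # With an open channel ch as in the example above, the save call would be
    # analogous (not run here; "TWEETS" is an assumed table name):
    # sqlSave(ch, dat = tweets_ready, tablename = "TWEETS", rownames = FALSE)
    ```

    If the tweet text can exceed the default character column size that sqlSave picks, the varTypes argument of sqlSave can be used to request a wider column type.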