Tags: java, mysql, r, rcaller

How do I pass a database table from Java to R using RCaller?


I am currently doing a regression analysis on data from a MySQL database using RCaller. Now I'm stuck on how to pass a database table from Java to R. This is what I have tried:

    Class.forName("com.mysql.jdbc.Driver");
    conn = DriverManager.getConnection(DB_URL,USER,PASS);
    stmt = conn.createStatement();
    String sql;
    sql = "SELECT bf,ibt,rate FROM testing";
    ResultSet rs = stmt.executeQuery(sql);
    while(rs.next()){
        float bf  = rs.getFloat("bf");

    }
    RCaller caller = new RCaller();
    RCode code = new RCode();
    caller.setRscriptExecutable("C:/Program Files/R/R-2.15.0/bin/Rscript.exe");
    code.clear();
    caller.setRCode(code);

    code.R_require("rpart");
    code.addRCode("ad.apprentissage= rpart(rate~, data=rs,cp=0.1)");
    code.addRCode("predArbreDecision=predict(ad.apprentissage,newdata=rs,type='class') ");

    File file = code.startPlot();
    code.addRCode("plot(ad.apprentissage)");
    caller.runOnly();
    ImageIcon ii = code.getPlot(file);
    code.showPlot(file);

But the line below does not seem to work:

  code.addRCode("ad.apprentissage= rpart(rate~, data=rs,cp=0.1)"); 

The program runs without errors, but the output is empty.


Solution

  • You can pass the data from Java to R as a data.frame using version 3.0 of RCaller, which has minimal support for data.frame objects. A data frame can be created like this:

    Object[][] objects = new Object[][]{{1,2,3}, {"a", "b", "c"}};
    String[] names = new String[] {"numbers", "letters"};
    DataFrame dataFrame = DataFrame.create(objects, names);
    

    The addDataFrame() method of the RCode class can then be used to transfer the data from Java to R:

    RCode rCode = RCode.create();
    rCode.addDataFrame("df", dataFrame);
    

    The columns of this data.frame object are then accessible in R. For example,

    rCode.addRCode("mymean <- mean(df$numbers)");
    

    creates a variable holding the mean of the numbers column in df. RCaller passes data frame objects efficiently, so the cost of transferring the data is small.
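
    For the query in the question, the rows of the ResultSet first have to be reshaped into the column-by-column Object[][] layout shown above. The following is only a minimal sketch of that step: ResultSetColumns, readRows and toColumns are hypothetical names and not part of RCaller, and whether RCaller accepts every element type returned by getObject() is an assumption to verify.

    ```java
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.util.ArrayList;
    import java.util.List;

    // Hypothetical helper (not part of RCaller): reshapes JDBC rows into
    // one Object[] per column, the layout passed to DataFrame.create above.
    final class ResultSetColumns {

        // Reads every remaining row of rs, keeping only the named columns.
        static List<Object[]> readRows(ResultSet rs, String... columnLabels) throws SQLException {
            List<Object[]> rows = new ArrayList<>();
            while (rs.next()) {
                Object[] row = new Object[columnLabels.length];
                for (int c = 0; c < columnLabels.length; c++) {
                    row[c] = rs.getObject(columnLabels[c]);
                }
                rows.add(row);
            }
            return rows;
        }

        // Transposes row-major data into column-major data.
        static Object[][] toColumns(List<Object[]> rows, int columnCount) {
            Object[][] columns = new Object[columnCount][rows.size()];
            for (int r = 0; r < rows.size(); r++) {
                for (int c = 0; c < columnCount; c++) {
                    columns[c][r] = rows.get(r)[c];
                }
            }
            return columns;
        }
    }
    ```

    With names = {"bf", "ibt", "rate"}, the result of toColumns(readRows(rs, names), names.length) could be passed to DataFrame.create(objects, names) and then registered with addDataFrame("rs", dataFrame), so that the name rs used in the R code refers to an actual data frame. Note also that the formula rate~ in the question has no right-hand side; something like rate ~ . was presumably intended.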

    Alternatively, you can write an R script that connects directly to the database engine and runs the SQL queries on the R side. For example, the RMySQL package is a good option if your data is stored in a MySQL database.
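
    A rough sketch of this second approach, kept in Java so the lines can be handed to addRCode() one by one. RMySqlScript and loadTable are hypothetical names, all connection values are placeholders, and the sketch assumes the RMySQL package is installed in the R installation that RCaller runs.

    ```java
    import java.util.Arrays;
    import java.util.List;

    // Hypothetical helper: builds the lines of an R script that reads a
    // MySQL query result into a data frame using the RMySQL package.
    final class RMySqlScript {

        // Values are pasted into the script verbatim, so they must not
        // contain quote characters; this sketch does no escaping.
        static List<String> loadTable(String host, String dbName, String user,
                                      String password, String query, String target) {
            return Arrays.asList(
                "library(RMySQL)",
                "con <- dbConnect(MySQL(), host = '" + host + "', dbname = '" + dbName
                    + "', user = '" + user + "', password = '" + password + "')",
                target + " <- dbGetQuery(con, \"" + query + "\")",
                "dbDisconnect(con)"
            );
        }
    }
    ```

    Each returned line could then be added with code.addRCode(line) before the rpart call, using "rs" as the target so that the data frame gets the name the question's R code expects.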