
Java and R integration


I am trying to build a Java project that contains R code. The idea is to automate data structuring and data analysis within the same project. I have partly managed this: I connected R to Java and my R code runs well. I did all the setup on my local machine, and it gives me all the output I need. Because the data set is big, I am trying to run this on an Amazon server. But when I move the project to the server, it does not work properly: it cannot execute library(XLConnect) or library(rJava). Whenever I call these two libraries from my Java project, it crashes. Run independently in R, the code works and gives me output. What can I do about this, and how do I fix this error? Please help.


My Java code is:

import java.io.InputStreamReader;
import java.io.Reader;


public class TestRMain {

    public static void main(String[] arg)throws Exception{

        ProcessBuilder broker = new ProcessBuilder("R.exe","--file=E:\\New\\Modified_Best_Config.R");
        Process runBroker = broker.start();

        Reader reader = new InputStreamReader(runBroker.getInputStream());
        int ch;
        while((ch = reader.read())!= -1)
            System.out.print((char)ch);
        reader.close();

        runBroker.waitFor();

        System.out.println("Execution complete");

    }
}

And in Modified_Best_Config.R I have written this code:

library('ClustOfVar');
library("doBy");
library(XLConnect)
#library(rJava)
#library(xlsx)

path="E:/New/";


############Importing and reading the excel files into R##############

Automated_R <- loadWorkbook("E:/New/Option_Mix_Calculation1.xlsx")

sheet1 <- readWorksheet(Automated_R, sheet = "Current Output")
sheet2 <- readWorksheet(Automated_R, sheet = "Actual Sales monthly")
sheet3 <- readWorksheet(Automated_R, sheet = "Differences")


#####################Importing raw Data###############################

optionData<- read.csv(paste(path,"ModifiedStructureNewBestConfig1.csv",sep=""),head=TRUE,sep=",");


nrow(optionData)
optionDemand=sapply(split(optionData,optionData$Trim),trimSplit);
optionDemand1=t(optionDemand[c(-1,-2),]);
optionDemand1

################Calculating the equipment Demand####################

optionDemand2<-t(optionDemand2[c(-1,0)]);

Rownames <- as.data.frame(row.names(optionDemand2))

writeWorksheet(Automated_R,Rownames, sheet = "Current Output", startRow = 21, startCol = 1)
writeWorksheet(Automated_R,optionDemand2, sheet = "Current Output", startRow = 21, startCol = 2)
saveWorkbook(Automated_R)

But Java stops executing after this line:

    library("doBy");

The whole set of code runs nicely on my local machine, but whenever I try to run it on the Amazon server it does not run. Run on its own in R, this code works on the server. I also have a couple more R scripts that run without any error. What can I do about this? Please help.


Solution

  • Thanks for updating your question with some example code. I cannot completely replicate your circumstances because I don't currently have access to Amazon EC2, and I don't know the specific type of instance you are using. But here are a couple of suggestions for debugging your issue, which I have a hunch is being caused by a missing package.

    1. Try to install the offending packages via your R script

    At the very beginning of your R script, before you try to load any packages, insert the following:

    install.packages(c("XLConnect", "rJava"))
    

    If your instance has a CRAN mirror configured (essentially, the online repository from which R downloads package source code), this should install the packages into the same library where your other packages are kept on your server. After that, either library or require should load your packages.

    (sidenote: rJava is actually a dependency of XLConnect, so it will automatically load anyway if you only specify library(XLConnect))

    2. If the above does not work, try installing the packages via the command line

    This is essentially what @Ben was suggesting with his comment. Alternatively, see perhaps this link, which deals with a similar problem with a different package. If you can open a terminal on the server, I would try entering the following three commands:

    sudo add-apt-repository ppa:marutter/rrutter
    sudo apt-get update
    sudo apt-get install r-cran-xlconnect
    

    In my experience this has been a good go-to repo when I can't seem to find a package I need to install. But you may or may not have permission to install packages on your server instance.
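
Whichever route you take, it may help to see the actual error R prints when it is launched from Java. The code in the question reads only the process's standard output, while R writes its error messages (for example a failed library() call) to standard error, so they are never shown. Below is a minimal sketch of a runner that merges the two streams. The class and method names (RScriptDiagnostics, buildCommand, runAndEcho) are made up for illustration, and it assumes Rscript is on the PATH of the user that runs the Java process, which may not hold on your server.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class, not part of the original project.
class RScriptDiagnostics {

    // Builds the command line. Both arguments are assumptions to adjust:
    // the Rscript executable (or its full path) and the script to run.
    static List<String> buildCommand(String rscriptExecutable, String scriptPath) {
        return Arrays.asList(rscriptExecutable, scriptPath);
    }

    // Starts the process with stderr merged into stdout, echoes every line
    // it prints, and returns the exit code once the process has finished.
    static int runAndEcho(List<String> command) throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(command);
        builder.redirectErrorStream(true); // R's error messages now arrive on getInputStream()
        Process process = builder.start();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(process.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                System.out.println("[R] " + line);
            }
        }
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // Script path taken from the question; pass another path as the first argument.
        String script = args.length > 0 ? args[0] : "E:\\New\\Modified_Best_Config.R";
        int exitCode = runAndEcho(buildCommand("Rscript", script));
        System.out.println("R exited with code " + exitCode);
    }
}
```

If the output shows that a package cannot be found, or the exit code is non-zero, that would point to the Java-launched R using a different library location or environment than your interactive R session.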