I would like to call an R script from Java. I have searched Google on the topic, but almost all of the results I have seen would require me to add a dependency on a third-party library. How can I accomplish the same thing without adding any dependencies to my code?
I am using a Windows machine, so perhaps I could use the command line to start R (if it is not already open) and run a specific R script. However, I have never written command-line code or called it from Java, so I would need code examples.
Below I am including working sample code for one possible approach, using my command-line idea. As the inline comments show, I intentionally left Step Three in AssembleDataFile.java blank. How can I make the command-line idea work at this point?
I would be open to another approach that does not involve adding any more dependencies to my code.
Here is what I have so far:
AssembleDataFile.java
import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;
import java.io.PrintWriter;

public class AssembleDataFile {
    static String delimiter;
    static String localPath = "C:\\test\\cr\\";
    static String[][] myDataArray;

    public static void main(String[] args) {
        String inputPath = localPath + "pd\\";
        String fileName = "MSData.txt";
        delimiter = "\\t";
        // Step One: Import data in two parts
        try {
            // 1A: get length of data file
            BufferedReader br1 = new BufferedReader(new FileReader(inputPath + fileName));
            int numRows = 0;
            int numCols = 0;
            String currentRow;
            while ((currentRow = br1.readLine()) != null) {
                numRows += 1;
                numCols = currentRow.split(delimiter).length;
            }
            br1.close();
            // 1B: populate data into array
            myDataArray = new String[numRows][numCols + 1];
            BufferedReader br2 = new BufferedReader(new FileReader(inputPath + fileName));
            String eachRow;
            int rowIdx = 0;
            while ((eachRow = br2.readLine()) != null) {
                String[] splitRow = eachRow.split(delimiter);
                for (int z = 0; z < splitRow.length; z++) {
                    myDataArray[rowIdx][z] = splitRow[z];
                }
                rowIdx += 1;
            }
            br2.close();
            // Step Two: Write data to csv
            String rPath = localPath + "r\\";
            String sFileName = rPath + "2colData.csv";
            PrintWriter outputWriter = new PrintWriter(sFileName);
            for (int q = 0; q < myDataArray.length; q++) {
                outputWriter.println(myDataArray[q][8] + ", " + myDataArray[q][9]);
            }
            outputWriter.close();
            // Step Three: Call R script named My_R_Script.R that uses 2ColData.csv as input
            // not sure how to write this code. How can I write this part?
            // For what it is worth, one of the R scripts that I intend to call is included below
            //
            // added the following lines here, per Vincent's suggestion:
            String rScriptFileName = rPath + "My_R_Script.R";
            Runtime.getRuntime().exec("mypathto\\R\\bin\\Rscript " + rScriptFileName);
            //
            //
            // Step Four: Import data from R and put it into myDataArray's empty last column
            try {
                Thread.sleep(30000); // make this thread sleep for 30 seconds while R creates the needed file
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
            String matchFileName = rPath + "Matches.csv";
            BufferedReader br3 = new BufferedReader(new FileReader(matchFileName));
            String thisRow;
            int rowIndex = 0;
            while ((thisRow = br3.readLine()) != null) {
                String[] splitRow = thisRow.split(delimiter);
                myDataArray[rowIndex][numCols] = splitRow[0];
                rowIndex += 1;
            }
            br3.close();
            // Step Five: Check work by printing out one row from myDataArray
            // Note that the printout has one more column than the input file had.
            for (int u = 0; u <= numCols; u++) {
                System.out.println(String.valueOf(myDataArray[1][u]));
            }
        } catch (FileNotFoundException e) {
            e.printStackTrace();
        } catch (IOException ie) {
            ie.printStackTrace();
        }
    }
}
My_R_Script.R
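library(sp)        # provides SpatialPoints and overlay
library(maptools)  # provides readShapeSpatial
myCSV <- read.csv(file="2colData.csv",head=TRUE,sep=",")
pts = SpatialPoints(myCSV)
Codes = readShapeSpatial("mypath/myshapefile.shp")
# use the Codes object read above (the original line referred to an undefined ZipCodes)
write.csv(Codes$F[overlay(pts,Codes)], "Matches.csv", quote=FALSE, row.names=FALSE)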
Here is the error that was thrown when I added Runtime.getRuntime().exec("Rscript "+rScriptFileName); to the code above:
java.io.IOException: Cannot run program "Rscript": CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessBuilder.start(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at java.lang.Runtime.exec(Unknown Source)
at AssembleDataFile.main(AssembleDataFile.java:52)
Caused by: java.io.IOException: CreateProcess error=2, The system cannot find the file specified
at java.lang.ProcessImpl.create(Native Method)
at java.lang.ProcessImpl.<init>(Unknown Source)
at java.lang.ProcessImpl.start(Unknown Source)
... 5 more
The code above now works because I followed Vincent's suggestions. However, I had to add a sleep call to give the R script enough time to run. Without it, the Java code above throws an error saying that the Matches.csv file does not exist. I am concerned that a fixed 30-second sleep is too blunt an instrument. How can I make the Java program wait until the R script has finished creating Matches.csv? I hesitate to use threading tools because I have read that poorly designed threads can cause bugs that are nearly impossible to localize and fix.
You just want to call an external application: wouldn't the following work?
Runtime.getRuntime().exec("Rscript myScript.R");
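On the follow-up about the 30-second sleep: Process.waitFor() blocks the calling thread until the child process exits, so no extra threads or fixed sleeps should be needed. Below is a minimal sketch of that idea using only the JDK. The class name RunRScript and the helper runAndWait are hypothetical names made up for illustration, and the Rscript path is the same placeholder used in the question, so substitute your real install path. The working directory is set to the folder holding the script because the R script reads 2colData.csv and writes Matches.csv using relative paths.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RunRScript {

    // Hypothetical helper: starts the command, forwards the child's output
    // to this console, and blocks until the child process has exited.
    // Returns the child's exit code (0 conventionally means success).
    public static int runAndWait(List<String> command, File workingDir)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(command);
        pb.directory(workingDir); // null means "use the current working directory"
        pb.inheritIO();           // show R's messages and errors in the Java console
        Process process = pb.start();
        return process.waitFor();
    }

    public static void main(String[] args) throws InterruptedException {
        // Paths taken from the question; "mypathto" is a placeholder, not a real path.
        String rPath = "C:\\test\\cr\\r\\";
        List<String> command =
                Arrays.asList("mypathto\\R\\bin\\Rscript", rPath + "My_R_Script.R");
        try {
            int exitCode = runAndWait(command, new File(rPath));
            if (exitCode == 0) {
                System.out.println("R finished; Matches.csv can be read now.");
            } else {
                System.err.println("Rscript exited with code " + exitCode);
            }
        } catch (IOException e) {
            // Thrown when the executable or the working directory cannot be found,
            // which is the same situation as the CreateProcess error=2 in the question.
            System.err.println("Could not start Rscript: " + e.getMessage());
        }
    }
}
```

In AssembleDataFile, the call to runAndWait would take the place of both the exec line in Step Three and the Thread.sleep in Step Four. Passing the command as a list rather than one concatenated string also avoids problems if a path contains spaces.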