Renjin/Java - Vector class to Java array class


I’m currently using Ro.getElementAsDouble(index) (code below) to extract the values in Ro one by one into LogNormArray (a double[]). But I worry that the copy wastes memory. I’m very much a newbie to Java, so apologies if I’m thinking about this incorrectly. In C I would pass a single address to "share" the data instead of copying the whole array. Is there an analogous pointer/address-passing workflow?

Toy example I’m working with:

import java.util.Arrays;
import javax.script.*;
import org.renjin.sexp.*;
import org.renjin.script.*;

public class VectorRenjin {
  public static void main(String[] args) throws Exception {
    double[] LogNormArray = new double[10];
    Vector Ro = LogNormalRand(10,0.0,1.0);
    System.out.println("draws from log-normal " + Ro);
    for(int i=0;i< Ro.length();i++) LogNormArray[i] = Ro.getElementAsDouble(i);
    System.out.println("draws from log-normal " + Arrays.toString(LogNormArray));
  }
  public static Vector LogNormalRand(int n,double a, double b) throws Exception{
    // create a Renjin script engine via its factory:
    RenjinScriptEngineFactory factory = new RenjinScriptEngineFactory();
    ScriptEngine engine = factory.getScriptEngine();
    engine.put("n",n);
    engine.put("a",a);
    engine.put("b",b);
    Vector res = (Vector)engine.eval("rlnorm(n,a,b)");
    return res;
  }
}

Solution

  • Note that a Vector result is not guaranteed to be backed by an array. For example, the statement engine.eval("1:10000") currently evaluates to an IntSequence instance, but this is an implementation detail subject to change.

    The result of your evaluation above will always be an instance of AtomicVector, an interface implemented by all R values of type double, character, integer, complex, etc. The AtomicVector interface has a toDoubleArray method which will copy or create a double array containing the values.

    toDoubleArray() does not return a reference to the array, even if the Vector is backed by an array, because Renjin Vector objects are meant to be immutable so they can be shared between threads.

    Calling toDoubleArray() will probably still be more efficient than your code above, because it involves only one virtual dispatch instead of one per element, and many implementations rely on Arrays.copyOf(), which is faster than filling an array element by element.

    If performance becomes a life-or-death matter, then you can check to see if the result of the evaluation is a DoubleArrayVector, and then call toDoubleArrayUnsafe(). But then you have to pinky-swear not to modify the contents.
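
The copy-versus-share trade-off described above can be sketched in plain Java, without Renjin on the classpath. Everything in this block (DoubleVec, SequenceVec, ArrayVec) is a hypothetical stand-in written for illustration, not Renjin's actual classes; only the method names getElementAsDouble, toDoubleArray and toDoubleArrayUnsafe are borrowed from the answer.

```java
import java.util.Arrays;

public class VectorCopyDemo {

  /** Hypothetical read-only vector interface (an assumption, not Renjin's API). */
  interface DoubleVec {
    int length();
    double getElementAsDouble(int index);
    double[] toDoubleArray();
  }

  /** Not backed by an array: elements are computed on demand, like 1:n in R. */
  static final class SequenceVec implements DoubleVec {
    private final int from;
    private final int length;

    SequenceVec(int from, int length) {
      this.from = from;
      this.length = length;
    }

    public int length() { return length; }

    public double getElementAsDouble(int index) { return from + index; }

    // There is no backing array to hand out, so one has to be created.
    public double[] toDoubleArray() {
      double[] out = new double[length];
      for (int i = 0; i < length; i++) {
        out[i] = from + i;
      }
      return out;
    }
  }

  /** Backed by an array, but immutable from the caller's point of view. */
  static final class ArrayVec implements DoubleVec {
    private final double[] values;

    ArrayVec(double... values) {
      // Copy on the way in so the caller cannot mutate the vector later.
      this.values = Arrays.copyOf(values, values.length);
    }

    public int length() { return values.length; }

    public double getElementAsDouble(int index) { return values[index]; }

    // Safe: returns a defensive copy made with a single bulk copy.
    public double[] toDoubleArray() {
      return Arrays.copyOf(values, values.length);
    }

    // Unsafe: returns the backing array itself (a reference, no copy).
    // Callers must not write to it, or the "immutable" vector changes.
    double[] toDoubleArrayUnsafe() {
      return values;
    }
  }

  public static void main(String[] args) {
    ArrayVec vec = new ArrayVec(1.5, 2.5, 3.5);

    double[] copy = vec.toDoubleArray();
    copy[0] = -1.0; // only the copy changes
    System.out.println(vec.getElementAsDouble(0)); // prints 1.5

    // Both calls return the same array object, so nothing is copied.
    System.out.println(vec.toDoubleArrayUnsafe() == vec.toDoubleArrayUnsafe()); // prints true

    DoubleVec seq = new SequenceVec(1, 5);
    System.out.println(Arrays.toString(seq.toDoubleArray())); // prints [1.0, 2.0, 3.0, 4.0, 5.0]
  }
}
```

A Java array variable already holds a reference, so the unsafe accessor is the closest analogue to passing an address in C. Applied to the question's code, the answer's suggestion would presumably amount to casting the result of engine.eval(...) to AtomicVector and calling toDoubleArray() in place of the element-by-element loop.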