This is a beginner's question, but how do you save a 2D numpy array to a file in (compressed) R format using rpy2? To be clear, I want to save it from Python with rpy2 and later read it in R. I would like to avoid CSV because the amount of data will be large.
Looks like you want the save command. I would use the pandas R interface and do something like the following.
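import numpy as np
from rpy2.robjects import r
# Note: pandas.rpy was removed in pandas 0.20, so this import needs an older pandas
import pandas.rpy.common as com
from pandas import DataFrame
a = np.array([range(5), range(5)])
df = DataFrame(a)
# Convert the pandas DataFrame into an R data.frame
df = com.convert_to_r_dataframe(df)
# Bind it to the name "foo" in R's global environment, then save it compressed
r.assign("foo", df)
r("save(foo, file='here.gzip', compress=TRUE)")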
There may be a more elegant way, though; I'm open to better suggestions. In R, the saved file would then be used like this:
> load("here.gzip")
> foo
  X0 X1 X2 X3 X4
0  0  1  2  3  4
1  0  1  2  3  4
You can bypass pandas and use numpy2ri from rpy2, with something like:
import numpy as np
from rpy2.robjects import r
from rpy2.robjects.numpy2ri import numpy2ri
a = np.array([[i*2147483647**2 for i in range(5)], range(5)], dtype="uint64")
# Convert to double precision, since R doesn't have unsigned ints
a = np.array(a, dtype="float64")
ro = numpy2ri(a)
# Bind the converted matrix to the name "bar" in R, then save it compressed
r.assign("bar", ro)
r("save(bar, file='another.gzip', compress=TRUE)")
Then, in R:
> load("another.gzip")
> bar
     [,1]         [,2]         [,3]         [,4]         [,5]
[1,]    0 4.611686e+18 9.223372e+18 1.383506e+19 1.844674e+19
[2,]    0 1.000000e+00 2.000000e+00 3.000000e+00 4.000000e+00
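
One caveat about the float64 conversion, in case it matters for your data: a double can only hold integers exactly up to 2**53, so the large uint64 values in the first row get rounded on the way to R. The sketch below is only an illustration and does not use numpy or rpy2. It assumes Python's built-in float is an IEEE 754 double (as it is on common platforms), which is the same format as numpy's float64 and R's numeric. The helper name survives_double is made up for this example.

```python
# Same values as in the answer: the first row holds multiples of (2**31 - 1)**2,
# the second row holds small integers.
big_row = [i * 2147483647**2 for i in range(5)]
small_row = list(range(5))


def survives_double(value):
    """Return True if the integer is unchanged after a trip through a 64-bit float."""
    return int(float(value)) == value


big_exact = [survives_double(v) for v in big_row]
small_exact = [survives_double(v) for v in small_row]

# Show each large value next to what a double actually stores for it.
for v, ok in zip(big_row, big_exact):
    print(v, "->", int(float(v)), "exact" if ok else "rounded")
```

If exact integers of that size matter, it is probably safer to store them some other way (for example as strings) than to rely on R's numeric type.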