So I am struggling to use the rpy2 package to integrate some workflow between R and Python.
For example, imagine I'm trying to run a Python script that does a linear regression in R and I want to return all of the elements of that (in reality I'm trying to do something much more complicated than this).
I execute the following in Python (calling R; this assumes you have rpy2 installed):
import rpy2.robjects as ro
test = ro.r('''
# Load in data
df <- mtcars
# Run regression
out = lm(formula='mpg ~ cyl + hp + wt',data=df)
''')
And now what? I have a few questions:
1. How do I pull the various elements from the result? In R they would be out$coefficients and out$residuals, etc. I know there is documentation for this, but I'm a bit lost. Ideally, I would want the elements in useful formats, such as pandas dataframes or indexed lists.
2. What happens to df? robjects.r() seems to just save the last thing you gave it and throw away everything else. I suppose I can work with this, but it's not ideal.
3. Related to 2: Is there a much better way to do this? In general, if someone could put forward a "best practice" for this sort of thing, that would be helpful, since I'm sure there are many people who mainly use Python but occasionally have a very custom function they need to call in R, and they don't want to get too fancy with the integration. A way to call an R function using Pythonic input arguments would be great.
Q.1: How do I pull the various elements from the result?
Ans.1: After you run your R script:
test = ro.r(your_R_script)
You can use the following code to print out all the names and values in the test object.
# Iterate over names and values.
# Be careful: the output is very long.
for n, v in test.items():
    print(n)
    print(v)
To list all available names, run this code:
test.names
The output:
StrVector with 12 elements.
'coeffici... 'residuals' 'effects' 'rank' ... 'xlevels' 'call' 'terms' 'model'
To get the values of a single element, for example 'residuals', look up its position by name and index with it:
test[test.names.index('residuals')]
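As a shorter alternative to the index lookup, rpy2 vectors also have an rx2() method, which behaves like R's `[[` and so mirrors out$coefficients. The sketch below is only illustrative: named_vector_to_dict and demo are hypothetical helper names (not part of rpy2), and demo assumes rpy2 and a working R installation are available.

```python
def named_vector_to_dict(vec):
    """Map each element name of a named R vector to its value.

    Hypothetical helper: it only needs an object that is iterable
    and has a `.names` attribute, which rpy2 vectors provide.
    """
    return dict(zip(vec.names, vec))


def demo():
    """Fit the regression from the question and return its coefficients."""
    # Imported here so the helper above can be used without R installed.
    import rpy2.robjects as ro

    fit = ro.r("lm(mpg ~ cyl + hp + wt, data = mtcars)")
    # rx2() extracts one element by name, like fit[["coefficients"]] in R.
    coefficients = fit.rx2("coefficients")
    return named_vector_to_dict(coefficients)
```

Calling demo() should give a plain Python dict keyed by the coefficient names, which is easy to pass on to pandas or anything else on the Python side.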
Q.2: What happens to df?
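Ans.2: It is still available in the R global environment until you delete it or the R session ends. Only the value of the last expression is returned to Python, but everything the script assigned stays in R. You can run simple R code to check: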
ro.r('''
# View dataframe
df
''')
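You can also reach those R variables from Python without going through another R snippet, because rpy2 exposes the global environment as ro.globalenv, which supports keys() and lookup by name. This is a minimal sketch: r_variables and demo are hypothetical helper names, and demo assumes rpy2 and R are installed.

```python
def r_variables(env):
    """Return the sorted names of the variables defined in an environment.

    Hypothetical helper: works with any mapping-like object that has
    keys(), which includes rpy2's ro.globalenv.
    """
    return sorted(env.keys())


def demo():
    """Run the script from the question and fetch both objects from R."""
    # Imported here so the helper above can be used without R installed.
    import rpy2.robjects as ro

    ro.r("""
    df <- mtcars
    out <- lm(mpg ~ cyl + hp + wt, data = df)
    """)
    # Both names should now be listed in the R global environment.
    print(r_variables(ro.globalenv))
    # Look up each object by name, just like a dictionary.
    return ro.globalenv["df"], ro.globalenv["out"]
```

With this approach the R script can assign as many intermediate objects as it likes, and Python picks up whichever ones it needs afterwards.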
Q.3: Is there a much much better way to do this?
Ans.3: (No answer.)