When I use rpy2 to run a Cubist regression, I get this error:
Error in strsplit(tmp, "\"")[[1]] : subscript out of bounds
I tried using as.matrix to change the data format, but it still doesn't work.
import rpy2.robjects as robjects
from rpy2.robjects.packages import importr
from rpy2.robjects.vectors import FloatVector
from rpy2.robjects import pandas2ri
Cubist = importr('Cubist')
lattice = importr('lattice')
r = robjects.r
# Prepare the sample point data
dt = r('mtcars')
Z = FloatVector(dt[3])
X = FloatVector(dt[5])
X1 = FloatVector(dt[6])
T = r['cbind'](X,X1)
regr = r['cubist'](x=T,y=Z,committees=10)
If it is a matrix, the x argument to cubist() seems to require a dimnames attribute.
Setup in R:
library(Cubist)
dt = mtcars
Z = dt[, 4]
X = dt[, 6]
X1 = dt[, 7]
Now compare this (reproduces your error):
> T = cbind(dt[, 6], dt[, 7])
> str(T)
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
> cubist(x=T, y=Z, committees=10)
cubist code called exit with value 1
Error in strsplit(tmp, "\"")[[1]] : subscript out of bounds
vs.
> T = cbind(X, X1)
> str(T)
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "X" "X1"
> cubist(x=T, y=Z, committees=10)
Call:
cubist.default(x = T, y = Z, committees = 10)
Number of samples: 32
Number of predictors: 2
Number of committees: 10
Number of rules per committee: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
There are multiple ways to ensure the dimnames get attached via rpy2. One easy way with your code is simply to explicitly name the variables:
In [15]: T = r['cbind'](X=X,X1=X1)
In [16]: print(r['str'](T))
num [1:32, 1:2] 2.62 2.88 2.32 3.21 3.44 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:2] "X" "X1"
<rpy2.rinterface.NULLType object at 0x7f0d7c0f5608> [RTYPES.NILSXP]
In [17]: print(r['cubist'](x=T,y=Z,committees=10))
Call:
cubist.default(x = structure(c(2.62, 2.875, 2.32, 3.215, 3.44, 3.46,
205, 215, 230, 66, 52, 65, 97, 150, 150, 245, 175, 66, 91, 113, 264, 175,
335, 109), committees = 10L)
Number of samples: 32
Number of predictors: 2
Number of committees: 10
Number of rules per committee: 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
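If you would rather not type the names by hand at every cbind call, here is a rough sketch of two ways to get column names onto the matrix. The helper names named_columns and fit_cubist are my own and not part of rpy2 or Cubist. The second option relies on R's `colnames<-` replacement function being reachable through r[...], and the whole of fit_cubist assumes rpy2 and the Cubist R package are installed. I have not run it against every rpy2 version.

```python
def named_columns(names, vectors):
    """Pair each predictor vector with a column name (hypothetical helper).

    The returned dict is meant to be unpacked as keyword arguments to
    R's cbind, so that the resulting matrix carries dimnames.
    """
    names = list(names)
    vectors = list(vectors)
    if len(names) != len(vectors):
        raise ValueError("need exactly one name per column")
    if any(not n for n in names) or len(set(names)) != len(names):
        raise ValueError("column names must be non-empty and unique")
    return dict(zip(names, vectors))


def fit_cubist(committees=10):
    """Fit cubist on mtcars two ways (hypothetical helper).

    Assumes rpy2 and the R package Cubist are installed; the imports
    live inside the function so that named_columns works without them.
    """
    import rpy2.robjects as robjects
    from rpy2.robjects.packages import importr
    from rpy2.robjects.vectors import FloatVector, StrVector

    importr('Cubist')
    r = robjects.r
    dt = r('mtcars')
    Z = FloatVector(dt[3])
    X = FloatVector(dt[5])
    X1 = FloatVector(dt[6])

    # Option 1: keyword arguments to cbind become the column names.
    T = r['cbind'](**named_columns(['X', 'X1'], [X, X1]))

    # Option 2: bind positionally, then attach the names afterwards
    # with R's `colnames<-` replacement function.
    T2 = r['cbind'](X, X1)
    T2 = r['colnames<-'](T2, StrVector(['X', 'X1']))

    fit1 = r['cubist'](x=T, y=Z, committees=committees)
    fit2 = r['cubist'](x=T2, y=Z, committees=committees)
    return fit1, fit2
```

Calling fit_cubist() and printing either result should give the same kind of summary as above. Either way the point is the same: the matrix handed to cubist() needs column names.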