Tags: r, machine-learning, random-forest, feature-selection, party

varimp (R partykit) returns an error when conditional = TRUE


First, I fitted a model with:

cf1 <- cforest(y~., data = DATA, strata = DATA$y,
           ntree = 200L, mtry = 10)

The dataset is highly imbalanced (y = 1 accounts for 7% of all observations), so I added strata to make sure that observations with y = 1 are not ignored in bagging. Judging by the confusion matrix, cf1 works normally. However, when I tried to compute conditional variable importance for feature selection with

cf1.imp_cond <- varimp(cf1, conditional = TRUE)

It returns

Error in x[strata == s] <- .resample(x[strata == s]) : 
NAs are not allowed in subscripted assignments

I can't figure out what this error means. Has anyone run into this before?

----update

Here is a modified test dataset derived from the original dataset I am using. Here is the code:

cf2 <- cforest(X5_years_survival~., data = test, strata = X5_years_survival,
           ntree = 200L, mtry = 6)
cf2.imp_cond <- varimp(cf2, conditional = TRUE)

I still get the same error:

Error in x[strata == s] <- .resample(x[strata == s]) : 
NAs are not allowed in subscripted assignments

----update

The error occurs when the kidids_node function is applied.
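
As background on the message itself: base R raises "NAs are not allowed in subscripted assignments" when a subscript containing NA is used in an assignment whose replacement has more than one element. The sketch below reproduces that with made-up vectors (x, strata and idx here are illustrative, not the real data). It only shows how the message can arise; it is not taken from partykit's internals, so whether an NA inside strata == s is the actual cause in this case is an assumption.

```r
# Made-up vectors for illustration; not the real data or partykit internals.
x <- 1:5
strata <- factor(c("a", "a", NA, "b", "b"))

# Comparing a factor that contains NA yields NA in the logical index.
idx <- strata == "a"   # TRUE TRUE NA FALSE FALSE

# Assigning more than one value through an index with NA is an error.
msg <- tryCatch({
  x[idx] <- rev(x[idx])
  "no error"
}, error = function(e) conditionMessage(e))
print(msg)

# which() drops the NA positions, so the same assignment goes through.
idx_ok <- which(strata == "a")
x[idx_ok] <- rev(x[idx_ok])
print(x)
```

A quick check along these lines is anyNA() on the variable used for strata, and on whatever logical index is built from it, before calling varimp.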


Solution

  • It turns out that if I keep all integer-typed covariates as integers, instead of converting them with as.factor, varimp runs without error.
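
If the covariates have already been converted with as.factor, one way to get back to integers is sketched below. The data frame df and the helper factor_to_int are hypothetical stand-ins, and the sketch assumes the factor levels are the original integer codes written as text. Note that as.integer() applied directly to a factor returns the internal level indices rather than the original values, which is why the conversion goes through as.character() first.

```r
# Hypothetical data frame standing in for the real covariates.
df <- data.frame(
  stage = factor(c("1", "3", "2", "3")),
  grade = factor(c("2", "2", "1", "3")),
  age   = c(61L, 47L, 70L, 55L)
)

# Hypothetical helper: recover the original integer codes from a factor.
# Going through as.character() avoids returning the internal level indices.
factor_to_int <- function(f) as.integer(as.character(f))

# Convert only the factor columns; other columns are left untouched.
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], factor_to_int)

str(df)
```

The response variable would normally be excluded from this conversion if it needs to stay a factor for classification.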