I'm getting the error below when running RFE feature selection in R with the 'ranger' method, and with other similar methods too. I've already tried removing highly correlated features, filtering near-zero-variance (nzv) predictors, changing the method, and using a weight matrix, but I always get a similar error. RFE runs for a few folds and then stops.
variable.sizes <- c(2, 5, 50, 500)
control <- rfeControl(functions = caretFuncs, method = "cv",
                      verbose = TRUE, returnResamp = "all",
                      number = num.iters)
results.rfe <- rfe(x = featureVars, y = classVars,
                   sizes = variable.sizes,
                   rfeControl = control, trControl = trainControl(method = "cv"),
                   preProcess = c("scale", "center"), method = "ranger")
featureVars is a data frame (I also tried a matrix) with 334 rows, and classVars is a factor with 3 levels and 334 elements. The rfe call gets past the parsing stage, runs a few folds, and then stops, as shown in this output:
+(rfe) fit Fold1 size: 992
-(rfe) fit Fold1 size: 992
+(rfe) imp Fold1
+(rfe) fit Fold2 size: 992
Error in { : task 1 failed - "No importance values available"
Here is the sessionInfo() output. I've updated all the dependencies of the imported packages.
> sessionInfo()
R version 4.0.3 (2020-10-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.2 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
[5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8 LC_PAPER=en_US.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] ranger_0.12.1 dplyr_1.0.5 e1071_1.7-6 caret_6.0-87 ggplot2_3.3.3 lattice_0.20-41
loaded via a namespace (and not attached):
[1] Rcpp_1.0.6 pillar_1.5.1 compiler_4.0.3 gower_0.2.2 plyr_1.8.6
[6] iterators_1.0.13 class_7.3-18 tools_4.0.3 rpart_4.1-15 ipred_0.9-11
[11] lubridate_1.7.10 lifecycle_1.0.0 tibble_3.1.0 gtable_0.3.0 nlme_3.1-151
[16] pkgconfig_2.0.3 rlang_0.4.10 Matrix_1.3-2 foreach_1.5.1 DBI_1.1.1
[21] prodlim_2019.11.13 stringr_1.4.0 withr_2.4.1 pROC_1.17.0.1 generics_0.1.0
[26] vctrs_0.3.7 recipes_0.1.15 stats4_4.0.3 nnet_7.3-15 grid_4.0.3
[31] tidyselect_1.1.0 data.table_1.14.0 glue_1.4.2 R6_2.5.0 fansi_0.4.2
[36] survival_3.2-7 lava_1.6.9 reshape2_1.4.4 purrr_0.3.4 magrittr_2.0.1
[41] ModelMetrics_1.2.2.2 splines_4.0.3 MASS_7.3-53 scales_1.1.1 codetools_0.2-18
[46] ellipsis_0.3.1 assertthat_0.2.1 timeDate_3043.102 colorspace_2.0-0 utf8_1.2.1
[51] proxy_0.4-25 stringi_1.5.3 munsell_0.5.0 crayon_1.4.1
With ranger, you need to specify the importance measure so that importance values are calculated, for example importance = "impurity". By default ranger uses importance = "none" and stores no importance values, which is why the ranking step of rfe fails. See the ranger help page for more information.
Below I use an example dataset with the importance argument specified, and you can see it works:
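library(caret)

set.seed(1)  # make the noise columns reproducible
# iris predictors plus 6 columns of random noise (150 rows x 6 columns)
x = cbind(iris[,1:4], matrix(rnorm(nrow(iris) * 6), ncol = 6))
y = iris$Species
variable.sizes <- c(2, 4, 6)
control <- rfeControl(functions = caretFuncs, method = "cv",
                      verbose = TRUE, returnResamp = "all",
                      number = 5)
# importance = "impurity" is passed through train() to ranger()
results.rfe <- rfe(x = x, y = y,
                   sizes = variable.sizes,
                   rfeControl = control,
                   trControl = trainControl(method = "cv"),
                   preProcess = c("scale", "center"), method = "ranger",
                   importance = "impurity")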
Output:
+(rfe) fit Fold1 size: 10
-(rfe) fit Fold1 size: 10
+(rfe) imp Fold1
-(rfe) imp Fold1
+(rfe) fit Fold1 size: 6
-(rfe) fit Fold1 size: 6
+(rfe) fit Fold1 size: 4
-(rfe) fit Fold1 size: 4
+(rfe) fit Fold1 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
-(rfe) fit Fold1 size: 2
+(rfe) fit Fold2 size: 10
-(rfe) fit Fold2 size: 10
+(rfe) imp Fold2
-(rfe) imp Fold2
+(rfe) fit Fold2 size: 6
-(rfe) fit Fold2 size: 6
+(rfe) fit Fold2 size: 4
-(rfe) fit Fold2 size: 4
+(rfe) fit Fold2 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
-(rfe) fit Fold2 size: 2
+(rfe) fit Fold3 size: 10
-(rfe) fit Fold3 size: 10
+(rfe) imp Fold3
-(rfe) imp Fold3
+(rfe) fit Fold3 size: 6
-(rfe) fit Fold3 size: 6
+(rfe) fit Fold3 size: 4
-(rfe) fit Fold3 size: 4
+(rfe) fit Fold3 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
-(rfe) fit Fold3 size: 2
+(rfe) fit Fold4 size: 10
-(rfe) fit Fold4 size: 10
+(rfe) imp Fold4
-(rfe) imp Fold4
+(rfe) fit Fold4 size: 6
-(rfe) fit Fold4 size: 6
+(rfe) fit Fold4 size: 4
-(rfe) fit Fold4 size: 4
+(rfe) fit Fold4 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
-(rfe) fit Fold4 size: 2
+(rfe) fit Fold5 size: 10
-(rfe) fit Fold5 size: 10
+(rfe) imp Fold5
-(rfe) imp Fold5
+(rfe) fit Fold5 size: 6
-(rfe) fit Fold5 size: 6
+(rfe) fit Fold5 size: 4
-(rfe) fit Fold5 size: 4
+(rfe) fit Fold5 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
-(rfe) fit Fold5 size: 2
note: only 1 unique complexity parameters in default grid. Truncating the grid to 1 .
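To see the effect of the importance argument outside of rfe, you can fit ranger directly with and without it. This is a minimal sketch, assuming the ranger package is installed; the object names fit_default and fit_imp and the choice of num.trees = 100 are mine, for illustration only.

```r
library(ranger)

set.seed(1)

# Default fit: importance is "none", so no importance values are stored
fit_default <- ranger(Species ~ ., data = iris, num.trees = 100)
length(fit_default$variable.importance)  # nothing for rfe to rank on

# Same model, but asking ranger to compute an importance measure
fit_imp <- ranger(Species ~ ., data = iris, num.trees = 100,
                  importance = "impurity")
fit_imp$variable.importance  # one value per predictor
```

The first fit has nothing in variable.importance, which is the situation rfe runs into when it tries to rank the features. "permutation" is another accepted value for importance if you prefer it over the impurity-based measure.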