Tags: r, regression, r-caret, ensemble-learning

caretEnsemble error: Error in FUN(X[[i]], ...) : { .... is not TRUE


I've been trying to stack predictions from two regression models (glmnet and bagEarth), but I keep getting the message "Error in FUN(X[[i]], ...) : { .... is not TRUE". From what I've read, this issue tends to stem from the resampling indexes, but since I am training the models together, I can't see how that could be the cause here. I've been able to replicate the error using random numbers:

library(caret)
library(caretEnsemble)
rm(list=ls())

training <- as.data.frame(cbind(runif(24,1,100)
,runif(24,1,100)
,runif(24,1,100)
,runif(24,1,100)
,runif(24,1,100)
,runif(24,1,100)))

colnames(training) <- c("y", "x1", "x2", "x3", "x4", "x5")

set.seed(7)
ctrl <- trainControl(method = "cv", number = 3, returnResamp = "all", classProbs = FALSE, index = createMultiFolds(training$y, k = 3, times = 1))
model_list <- caretList(y~., data = training, trControl = ctrl, metric = "RMSE", methodList = c("glmnet", "bagEarth"))
train_ctrl <- trainControl(method = "cv", number = 3, classProbs = FALSE, savePredictions = TRUE, index = createMultiFolds(training$y, k = 3, times = 1))
glm_ensemble <- caretStack(model_list, method = "glm", metric = "RMSE", trControl = train_ctrl)

I know I am probably missing a key element somewhere; any input is appreciated.

Thanks, Anton


Solution

  • A bit of debugging shows that the error comes from a function called bestPreds. This is a non-exported function that looks in the models of model_list for the saved predictions, which are only kept when savePredictions is set to "all" or "final" in the control object. You have not set this in your control object. If you add it, everything runs fine. I do admit that a more informative error message would be nice here instead of a bare failed assertion.

    ctrl <- trainControl(method = "cv", number = 3, returnResamp = "all", 
                         savePredictions = "final",  # needs to be final or all
                         classProbs = FALSE, index = createMultiFolds(training$y, k = 3, times = 1))
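
For completeness, here is a minimal end-to-end sketch of the question's example with the fix applied. It assumes caret, caretEnsemble, glmnet and earth are installed; the names `folds`, `has_preds` and `stack_ctrl` are introduced here for illustration only. The check on `$pred` is a suggested pre-flight test, not something caretEnsemble requires you to write, and behaviour may differ between caretEnsemble versions.

```r
library(caret)
library(caretEnsemble)

set.seed(7)

# Random data as in the question: 24 rows, response y and five predictors
training <- as.data.frame(matrix(runif(24 * 6, 1, 100), ncol = 6))
colnames(training) <- c("y", "x1", "x2", "x3", "x4", "x5")

# One set of resampling indexes shared by both base models
folds <- createMultiFolds(training$y, k = 3, times = 1)

# savePredictions = "final" (or "all") keeps the hold-out predictions
# that the stacking step needs
ctrl <- trainControl(method = "cv", number = 3, returnResamp = "all",
                     savePredictions = "final", classProbs = FALSE,
                     index = folds)

model_list <- caretList(y ~ ., data = training, trControl = ctrl,
                        metric = "RMSE",
                        methodList = c("glmnet", "bagEarth"))

# Pre-flight check: each fitted train object should carry a non-empty $pred;
# it is NULL when savePredictions was left at its default
has_preds <- sapply(model_list,
                    function(m) !is.null(m$pred) && nrow(m$pred) > 0)
stopifnot(all(has_preds))

# Separate control object for the meta-model
stack_ctrl <- trainControl(method = "cv", number = 3,
                           savePredictions = "final")
glm_ensemble <- caretStack(model_list, method = "glm", metric = "RMSE",
                           trControl = stack_ctrl)
```

If `has_preds` contains a `FALSE`, the control object passed to caretList is the first place to look, since that is where the saved predictions are switched on.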