The caret package does not seem to apply the recipe step that removes NAs during cross-validation. I guess I am overlooking something...
library(data.table)
library(caret)
library(recipes)
iris_dt <- as.data.table(iris)
iris_dt[3:5, ':='(Petal.Length = NA)]  # introduce missing values in the outcome
control <- trainControl(method = 'cv', number = 2, allowParallel = TRUE)
rec <- recipe(Petal.Length ~ Sepal.Width, iris_dt) %>% step_naomit(all_outcomes(), all_predictors())
train(rec, iris_dt, method = 'lm', trControl = control)
Error in quantile.default(y, probs = seq(0, 1, length = cuts)) : missing values and NaN's not allowed if 'na.rm' is FALSE
It also fails when the predictor is NA, although with a different error message. When the data are prepped and baked first and then passed to the x/y interface of train(), it works.
Many thanks for any hints.
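The recipe works fine, but the resamples are created before the recipe is applied, so the missing values are still present when the folds are built. You should remove the rows with missing values before calling train(), or use the formula method with na.action = na.omit: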
> iris_dt <- as.data.table(iris)
> iris_dt[3:5,':='(Petal.Length=NA)]
> control <- trainControl(method='cv',number=2,allowParallel = T)
> rec <- recipe(Petal.Length ~ Sepal.Width,iris_dt) %>% step_naomit(all_outcomes(),all_predictors())
> train(Petal.Length ~ Sepal.Width,iris_dt,method='lm',trControl = control, na.action = na.omit)
Linear Regression
150 samples
1 predictor
No pre-processing
Resampling: Cross-Validated (2 fold)
Summary of sample sizes: 74, 73
Resampling results:
  RMSE      Rsquared   MAE
  1.610659  0.1885815  1.363651
Tuning parameter 'intercept' was held constant at a value of TRUE
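The other option mentioned above, removing the incomplete rows before calling train(), could look roughly like the sketch below. It is only an illustration under a few assumptions: it uses a plain data.frame instead of a data.table, the names iris_na, iris_complete and vars are made up for this example, and step_naomit() is left out of the recipe because the rows are already gone by the time the resamples are created.

```r
library(caret)
library(recipes)

# Copy of iris with a few missing outcome values, as in the question
# (iris_na is a hypothetical name used only for this sketch)
iris_na <- iris
iris_na[3:5, "Petal.Length"] <- NA

# Drop incomplete rows BEFORE train() builds the resamples
vars <- c("Petal.Length", "Sepal.Width")
iris_complete <- iris_na[complete.cases(iris_na[, vars]), ]

# The recipe no longer needs step_naomit(): the data are already complete
rec <- recipe(Petal.Length ~ Sepal.Width, data = iris_complete)

set.seed(1)  # make the fold assignment reproducible
control <- trainControl(method = "cv", number = 2)
fit <- train(rec, data = iris_complete, method = "lm", trControl = control)
fit
```

With this approach the printed model should report 147 samples rather than 150, since the three rows with a missing Petal.Length are dropped up front instead of inside the resampling loop.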