
In R with tidymodels: Warning message: "All models failed in [fit_resamples()]. See the `.notes` column." internal: Error: In metric: `roc_auc`


I am new to R and trying to learn tidymodels.

I get this error only with glm on the iris dataset. If I change the dataset and recipe, glm runs fine, but then I start getting the same error with kknn.

Warning message:
"All models failed in [fit_resamples()]. See the `.notes` column."
Warning message:
"This tuning result has notes. Example notes on model fitting include:
internal: Error: In metric: `roc_auc`

I checked .notes and this is how it looks:

A tibble: 1 × 1
.notes
<chr>
internal: Error: In metric: `roc_auc`

Warning message: All models failed in [fit_resamples()]. See the `.notes` column

As suggested in the post above, I tried to upgrade the parsnip and tune packages from GitHub, but installing tune fails with: Warning in install.packages : package ‘tune’ is not available for this version of R

I am not sure what's wrong and would appreciate any help.

Version information:

-- Attaching packages --------------------------------------- tidyverse 1.3.0 --

v ggplot2 3.3.2     v purrr   0.3.4
v tibble  3.0.4     v dplyr   1.0.2
v tidyr   1.1.2     v stringr 1.4.0
v readr   1.4.0     v forcats 0.5.0

-- Conflicts ------------------------------------------ tidyverse_conflicts() --
x dplyr::filter() masks stats::filter()
x dplyr::lag()    masks stats::lag()

-- Attaching packages -------------------------------------- tidymodels 0.1.1 --

v broom     0.7.2          v recipes   0.1.14    
v dials     0.0.9          v rsample   0.0.8     
v infer     0.5.3          v tune      0.1.1     
v modeldata 0.0.2          v workflows 0.2.1     
v parsnip   0.1.3.9000     v yardstick 0.0.7     

-- Conflicts ----------------------------------------- tidymodels_conflicts() --
x scales::discard() masks purrr::discard()
x dplyr::filter()   masks stats::filter()
x recipes::fixed()  masks stringr::fixed()
x dplyr::lag()      masks stats::lag()
x yardstick::spec() masks readr::spec()
x recipes::step()   masks stats::step()


Windows 7
platform       x86_64-w64-mingw32          
arch           x86_64                      
os             mingw32                     
system         x86_64, mingw32             
status                                     
major          4                           
minor          0.3                         
year           2020                        
month          10                          
day            10                          
svn rev        79318                       
language       R                           
version.string R version 4.0.3 (2020-10-10)

Code:

library(tidyverse)
library(tidymodels)
library(themis)

iris

# Data split
set.seed(999)

iris_split <- initial_split(iris, strata = Species)

iris_train <- training(iris_split)
iris_test <- testing(iris_split)


# Cross Validation
set.seed(345)

iris_fold <- vfold_cv(iris_train)
print(iris_fold)


# recipe
iris_rec <- recipe(Species ~., data = iris_train) %>%

  # make sure the training set has equal numbers of each target class (not needed for the iris dataset)
  step_downsample(Species) %>% 

  #normalise the data
  step_center(-Species) %>% 
  step_scale(-Species) %>% 
  step_BoxCox(-Species) %>% 

  # estimate the recipe's parameters from the training data
  prep()


# Workflow
iris_wf <- workflow() %>%
    add_recipe(iris_rec)

# logistic
glm_spec <- logistic_reg() %>%
  set_engine("glm")


# to do parallel processing
doParallel::registerDoParallel()

# adding parameters to workflow
glm_rs <- iris_wf %>%
  add_model(glm_spec) %>%
  fit_resamples(
      resamples = iris_fold,
      metrics = metric_set(roc_auc, accuracy, sensitivity, specificity),
      control = control_resamples(save_pred = TRUE)
  )

ERROR

Warning message:
"All models failed in [fit_resamples()]. See the `.notes` column."
Warning message:
"This tuning result has notes. Example notes on model fitting include:
internal: Error: In metric: `roc_auc`

internal: Error: In metric: `roc_auc`

internal: Error: In metric: `roc_auc`"

# Resampling results
# 10-fold cross-validation 
# A tibble: 10 x 5
   splits           id     .metrics .notes           .predictions
   <list>           <chr>  <list>   <list>           <list>      
 1 <split [102/12]> Fold01 <NULL>   <tibble [1 x 1]> <NULL>      
 2 <split [102/12]> Fold02 <NULL>   <tibble [1 x 1]> <NULL>      
 3 <split [102/12]> Fold03 <NULL>   <tibble [1 x 1]> <NULL>      
 4 <split [102/12]> Fold04 <NULL>   <tibble [1 x 1]> <NULL>      
 5 <split [103/11]> Fold05 <NULL>   <tibble [1 x 1]> <NULL>      
 6 <split [103/11]> Fold06 <NULL>   <tibble [1 x 1]> <NULL>      
 7 <split [103/11]> Fold07 <NULL>   <tibble [1 x 1]> <NULL>      
 8 <split [103/11]> Fold08 <NULL>   <tibble [1 x 1]> <NULL>      
 9 <split [103/11]> Fold09 <NULL>   <tibble [1 x 1]> <NULL>      
10 <split [103/11]> Fold10 <NULL>   <tibble [1 x 1]> <NULL>      

(UPDATE)

I get the same error with a random forest (RF), even without parallel processing.



Solution

  • I had the same issue on a Linux machine and solved it by removing the NAs or imputing them. So, in my case at least, it seems the presence of NAs was causing the model fitting to fail. :)
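
To check whether missing values are the cause in your own data, something like the following may help. This is only a minimal sketch in base R, not a definitive fix: the built-in iris data has no missing values, so a few NAs are injected here purely for illustration, and the names `df`, `df_complete`, and `df_imputed` are hypothetical.

```r
# Hypothetical example data: iris itself has no NAs, so we inject a few
# to illustrate the check (rows and columns chosen arbitrarily).
df <- iris
df$Sepal.Length[c(3, 50)] <- NA
df$Petal.Width[10] <- NA

# 1. Count missing values per column to see where the problem is.
na_counts <- colSums(is.na(df))
print(na_counts)

# 2a. Option A: drop every row that contains at least one NA.
df_complete <- df[complete.cases(df), ]

# 2b. Option B: impute each numeric column with its median.
df_imputed <- df
num_cols <- vapply(df_imputed, is.numeric, logical(1))
df_imputed[num_cols] <- lapply(df_imputed[num_cols], function(x) {
  x[is.na(x)] <- median(x, na.rm = TRUE)
  x
})

# Confirm that no missing values remain after imputation.
print(anyNA(df_imputed))
```

If you prefer to handle this inside the recipe, recipes provides `step_naomit()` as well as imputation steps. The names of the imputation steps differ between recipes versions, so check the documentation for the version you have installed.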