
Pause and resume caret training in R


Suppose I want to train a model with caret in R, but split the training across two separate run sessions.

library(mlbench)
data(Sonar)
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,]
testing  <- Sonar[-inTraining,]

# First run session
nn.partial <- train(Class ~ ., data = training,
                    method = "nnet",
                    max.turns.of.iteration = 5) # Non-existent parameter, but it represents my goal

Let's assume that, instead of the full nn object, I have only a partial object (nn.partial) that holds the training information up to iteration 5. In a later session I could then run the code below to finish the training job:

library(mlbench)
data(Sonar)
library(caret)
set.seed(998)
inTraining <- createDataPartition(Sonar$Class, p = .75, list = FALSE)
training <- Sonar[ inTraining,]
testing  <- Sonar[-inTraining,]

nn <- train(Class ~ ., data = training,
            method = "nnet",
            previous.training = nn.partial) # Non-existent parameter, but it represents my goal

I am aware that neither max.turns.of.iteration nor previous.training exists in the train function. I am only trying to show in code what the ideal interface would look like if this were already implemented in train. Since the parameters are not there, is there a way to achieve this goal (i.e. run the caret training over more than one session) by tricking the function in some way?

I have tried to play with the trainControl function without success.

t.control <- trainControl(repeats = 5)
nn <- train(Class ~ ., data = training,
            method = "nnet",
            trControl = t.control)

Even with this setting, the number of iterations is still much higher than the 5 I want in my example.


Solution

  • I am almost certain that this would be very complicated to implement in caret's current infrastructure. However, I will show you how to achieve this sort of thing out of the box with mlr3.

    Required packages for the example:

    library(mlr3)
    library(mlr3tuning)
    library(paradox)
    

    Get an example task and define a learner to be tuned:

    task_sonar <- tsk('sonar')
    learner <- lrn('classif.rpart', predict_type = 'prob')
    

    Define the hyperparameters to be tuned:

    ps <- ParamSet$new(list(
      ParamDbl$new("cp", lower = 0.001, upper = 0.1),
      ParamInt$new("minsplit", lower = 1, upper = 10)
    ))
    

    Define the tuner and the resampling strategy:

    tuner <- tnr("random_search")
    cv3 <- rsmp("cv", folds = 3)
    

    Define the tuning instance:

    instance <- TuningInstance$new(
      task = task_sonar,
      learner = learner,
      resampling = cv3,
      measures = msr("classif.auc"),
      param_set = ps,
      # Multiple terminators can be combined, such as clock time, number of
      # evaluations, early stopping (stagnation), performance reached; see ?Terminator
      terminator = term("evals", n_evals = 100)
    )
    

    Tune:

    tuner$tune(instance)
    

    Now press stop after a second to interrupt the tuning in RStudio, then inspect the archive:

    instance$archive()
    
        nr batch_nr  resample_result task_id    learner_id resampling_id iters params tune_x warnings errors classif.auc
     1:  1        1 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7105586
     2:  2        2 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7372720
     3:  3        3 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7335368
     4:  4        4 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7335368
     5:  5        5 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7276246
     6:  6        6 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7111217
     7:  7        7 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.6915560
     8:  8        8 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7452875
     9:  9        9 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7372720
    10: 10       10 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7172328
    

    In my case it finished 10 iterations of random search. You can now, for instance, call

    save.image()
    

    then close RStudio and reopen the same project.

    Alternatively, use saveRDS/readRDS on the objects you wish to keep:

    saveRDS(instance, "i.rds")
    instance <- readRDS("i.rds")
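
    The same saveRDS/readRDS round trip can be turned into a small checkpoint pattern in base R. The sketch below is only an illustration: run_with_checkpoint, the grid, and the stand-in evaluate function are hypothetical names, not part of mlr3 or caret. It saves progress after every evaluation, so a later session skips whatever is already done.

```r
# Hypothetical helper: evaluate the rows of a settings grid, saving progress
# after each evaluation so that a later session can continue from the file.
run_with_checkpoint <- function(grid, evaluate, path, max_evals = Inf) {
  # Load earlier results if a checkpoint exists, otherwise start empty
  done <- if (file.exists(path)) readRDS(path) else list()
  # Rows that have not been evaluated yet (results are keyed by row index)
  todo <- setdiff(seq_len(nrow(grid)), as.integer(names(done)))
  n_run <- min(length(todo), max_evals)
  for (i in todo[seq_len(n_run)]) {
    done[[as.character(i)]] <- evaluate(grid[i, , drop = FALSE])
    saveRDS(done, path)  # persist after every evaluation
  }
  done
}

grid <- expand.grid(size = c(1, 2, 3), decay = c(0, 0.1))
# Stand-in for an expensive model fit; replace with a real resampling call
evaluate <- function(row) row$size * 10 + row$decay
path <- tempfile(fileext = ".rds")

first  <- run_with_checkpoint(grid, evaluate, path, max_evals = 2)  # "session 1"
second <- run_with_checkpoint(grid, evaluate, path)                 # "session 2"
```

    The second call evaluates only the rows that the first call did not reach, which is the same idea the tuning archive implements for you.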
    

    After loading the required packages, resume tuning with

    tuner$tune(instance)
    

    Stop it again after a few seconds. In my case it finished an additional 12 iterations:

    instance$archive()
    
        nr batch_nr  resample_result task_id    learner_id resampling_id iters params tune_x warnings errors classif.auc
     1:  1        1 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7105586
     2:  2        2 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7372720
     3:  3        3 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7335368
     4:  4        4 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7335368
     5:  5        5 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7276246
     6:  6        6 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7111217
     7:  7        7 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.6915560
     8:  8        8 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7452875
     9:  9        9 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7372720
    10: 10       10 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7172328
    11: 11       11 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7325289
    12: 12       12 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7105586
    13: 13       13 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7215133
    14: 14       14 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.6915560
    15: 15       15 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.6915560
    16: 16       16 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7335368
    17: 17       17 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7276246
    18: 18       18 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7111217
    19: 19       19 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7172328
    20: 20       20 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7276246
    21: 21       21 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7105586
    22: 22       22 <ResampleResult>   sonar classif.rpart            cv     3 <list> <list>        0      0   0.7276246
    

    Run it again without pressing stop:

    tuner$tune(instance)
    

    and it will finish the remaining evaluations, up to the 100 set by the terminator.

    Limitation: The above example splits the tuning (the evaluation of hyperparameters) across multiple sessions. However, it does not split a single training run into multiple sessions. Very few packages in R support this kind of thing; keras/tensorflow are the only ones I know of.

    However, regardless of how long a single training run takes for an algorithm, tuning that algorithm (evaluating its hyperparameters) takes much more time, so being able to pause and resume the tuning, as in the above example, is the more useful capability.
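
    As a narrower workaround for the nnet model in the question, a single fit can be restarted by hand when nnet is called directly, because nnet() accepts a maxit iteration cap and a Wts vector of starting weights. The sketch below is an assumption-laden illustration rather than a caret feature: it uses the built-in iris data and a fixed size = 3 so that it is self-contained, and the restart does not reproduce an uninterrupted run exactly, since the optimizer's internal state is not carried over.

```r
library(nnet)

set.seed(998)
# First session: stop the optimizer after at most 5 iterations
fit_partial <- nnet(Species ~ ., data = iris, size = 3, maxit = 5, trace = FALSE)

# Persist only the weights; a temporary file stands in for a real path
weights_file <- tempfile(fileext = ".rds")
saveRDS(fit_partial$wts, weights_file)

# Later session: start from the saved weights and allow up to 95 more iterations
start_wts <- readRDS(weights_file)
fit_resumed <- nnet(Species ~ ., data = iris, size = 3, maxit = 95,
                    Wts = start_wts, trace = FALSE)

# Compare the fitting criterion before and after the restart
c(partial = fit_partial$value, resumed = fit_resumed$value)
```

    The network architecture (here size = 3) has to be identical in both calls, otherwise the saved weight vector has the wrong length.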

    If you find this interesting, here are some resources for learning mlr3:

    https://mlr3book.mlr-org.com/
    https://mlr3gallery.mlr-org.com/

    Also take a look at mlr3pipelines: https://mlr3pipelines.mlr-org.com/articles/introduction.html