r, machine-learning, tidy, tidymodels

Applying the last_fit() function in tidymodels


If I understand correctly, to use the last_fit() function in tidymodels I need a split object created with rsample::initial_split().

However, I already have separate training and test data sets from the very beginning, so I don't want to use initial_split() to split the data into training and testing sets.

Since I cannot create a split object this way, does that mean I cannot use last_fit()?


Solution

  • If you want to create an rsplit object from existing testing and training sets, you can use make_splits():

    library(rsample)
    library(dplyr)
    #> 
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #> 
    #>     intersect, setdiff, setequal, union
    
    data(cells, package = "modeldata")
    
    # First data frame: training (analysis) set; second: test (assessment) set.
    make_splits(
      cells %>% filter(case == "Train"),
      cells %>% filter(case == "Test")
    )
    #> <Analysis/Assess/Total>
    #> <1009/1010/2019>
    

    Created on 2022-01-16 by the reprex package (v2.0.1)
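
    To show where the rebuilt split goes next, here is a minimal sketch that passes it to last_fit(). The workflow is an assumption made only for illustration (a plain logistic regression on all predictors, with the `case` column dropped so it is not used as a predictor); any workflow or model specification should slot in the same way.

    ```r
    library(tidymodels)

    data(cells, package = "modeldata")

    # Rebuild the rsplit from the pre-existing partitions, dropping the
    # `case` column so it is not treated as a predictor.
    cell_split <- make_splits(
      cells %>% filter(case == "Train") %>% select(-case),
      cells %>% filter(case == "Test") %>% select(-case)
    )

    # Hypothetical model choice, for illustration only.
    wf <- workflow() %>%
      add_formula(class ~ .) %>%
      add_model(logistic_reg() %>% set_engine("glm"))

    # Fit on the analysis (training) rows, evaluate on the assessment (test) rows.
    final_res <- last_fit(wf, split = cell_split)

    # Test-set metrics and per-row test-set predictions.
    collect_metrics(final_res)
    collect_predictions(final_res)
    ```

    Here training(cell_split) and testing(cell_split) return the two original data frames, so nothing is reshuffled between them.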

    Alternatively, you can skip last_fit() altogether and manually fit() on the training set and predict() on the test set.
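
    The manual route might look roughly like the sketch below. The model (logistic regression via glm) and the accuracy metric are assumptions chosen for illustration, not part of the original answer.

    ```r
    library(tidymodels)

    data(cells, package = "modeldata")

    # Pre-existing partitions; `case` is dropped so it is not a predictor.
    train_data <- cells %>% filter(case == "Train") %>% select(-case)
    test_data  <- cells %>% filter(case == "Test") %>% select(-case)

    # Hypothetical model choice, for illustration only.
    wf <- workflow() %>%
      add_formula(class ~ .) %>%
      add_model(logistic_reg() %>% set_engine("glm"))

    # Fit on the training set only.
    wf_fit <- fit(wf, data = train_data)

    # Predict on the test set and keep the true labels next to the predictions.
    test_pred <- predict(wf_fit, new_data = test_data) %>%
      bind_cols(test_data %>% select(class))

    # Compute a test-set metric with yardstick.
    accuracy(test_pred, truth = class, estimate = .pred_class)
    ```

    This gives the same kind of test-set evaluation as last_fit(), at the cost of computing the metrics yourself.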