r, r-caret, tidymodels

How to get beta estimates when converting from R caret to tidymodels


I don't see an easy way to get the parameter estimates out of a cross-validated model using the tidymodels ecosystem. How do I do it?

With caret I can fit a cross-validated model and get the parameter estimates like this:

library(caret)
library(tidyverse)
library(tidymodels)
data(ames)

set.seed(123)
mod_wo_Garage_Cars <- train(
  Sale_Price ~ .,
  data = select(ames, -Garage_Cars),
  method="lm",
  trControl=trainControl(method= "cv", number = 10)
)

summary(mod_wo_Garage_Cars) %>% 
  broom::tidy() %>% 
  filter(term == "Garage_Area")

I have a workflow that I think does the same modeling (give or take differences in train() vs. vfold_cv() resamples):

library(tidyverse)
library(tidymodels)
data(ames)

set.seed(123)
folds <- vfold_cv(ames, v = 10)

the_rec <- 
  recipe(Sale_Price ~ ., data = ames) %>% 
  step_rm(Garage_Cars)

the_lm_model <- 
  linear_reg() %>% 
  set_engine("lm")

the_workflow <- 
  workflow() %>% 
  add_recipe(the_rec) %>% 
  add_model(the_lm_model) 

mod_wo_Garage_Cars <- 
  fit_resamples(the_workflow, folds) 

I can see how to get the RMSE with show_best(mod_wo_Garage_Cars, metric = "rmse"). How do I get the overall model estimates of the betas out of this workflow?


Solution

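  • fit_resamples() only estimates performance on the folds and does not keep the fitted models. To get the coefficients, fit the workflow to the full data set, pull out the fitted model, and then tidy it.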

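    # The workflow has no tuning parameters, so there is nothing to
    # select or finalize: fit it to the full data set, which is also
    # how caret builds the final model it reports coefficients for.
    the_workflow %>% 
      fit(data = ames) %>% 
      extract_fit_parsnip() %>%   # replaces the deprecated pull_workflow_fit()
      tidy()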
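
If the coefficients from each individual fold are also of interest, one option is the extract argument of control_resamples(), which runs a function on every fold's fitted workflow. The following is only a minimal sketch: get_coefs and small_workflow are names made up for this example, and the three-predictor formula is an assumption chosen to keep the run quick. In practice the_workflow from the question could be passed in instead.

```r
library(tidymodels)
data(ames, package = "modeldata")

set.seed(123)
folds <- vfold_cv(ames, v = 10)

# Assumption: a small, illustrative predictor set rather than Sale_Price ~ .
small_workflow <-
  workflow() %>%
  add_formula(Sale_Price ~ Garage_Area + Gr_Liv_Area + Year_Built) %>%
  add_model(linear_reg() %>% set_engine("lm"))

# Hypothetical helper: called once per fold with that fold's fitted workflow
get_coefs <- function(x) {
  x %>%
    extract_fit_parsnip() %>%
    tidy()
}

fold_fits <- fit_resamples(
  small_workflow,
  folds,
  control = control_resamples(extract = get_coefs)
)

# .extracts is a list column of tibbles that each hold another list
# column named .extracts, so it is unnested twice
fold_coefs <-
  fold_fits %>%
  select(id, .extracts) %>%
  unnest(cols = .extracts) %>%
  unnest(cols = .extracts)

# One row per fold for a single term
fold_coefs %>%
  filter(term == "Garage_Area") %>%
  select(id, term, estimate, std.error)
```

These per-fold estimates show how stable a coefficient is across resamples. They are not the same thing as the single set of estimates from the fit on the full data set.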