How to compute variable importance from a linear model using tidymodels in R?


I am trying to compute variable importance for a linear model using tidymodels. As far as I can tell, the vip package is what people use to extract importance from tidymodels fits. For example, if I wanted to extract importance from a random forest fitted with tidymodels, I would do the following:

library(tidymodels)

aq <- na.omit(airquality)

model_rf <-
  rand_forest(
    mode = "regression"
  ) %>%
  set_engine("ranger",
             importance = "permutation"
  ) %>%
  fit(Ozone ~ ., data = aq)

# variable importance (vi() is exported, so :: is enough)
vip::vi(model_rf)

This returns the importance scores. However, if I try something similar with a linear model, it throws an error. For example:

# create model fit
lm_aq_model <- linear_reg() %>%
  set_engine("lm")

lm_fit <- lm_aq_model %>%
  fit(Ozone ~ ., data = aq)


vip::vi(lm_aq_model, method = "permute", target = "Ozone",
        metric = "rsquared", pred_wrapper = predict)
> Error in match.call(f, call = mcall) : invalid 'call' argument

What am I doing wrong here? If I try:

aqLM <- lm(Ozone ~ ., data = aq)
vip::vi(aqLM, method = "permute", target = "Ozone",
        metric = "rsquared", pred_wrapper = predict)

This works. Why can't I get it to work with tidymodels?

Also, is this the preferred way to extract importance when using tidymodels? Is there a generic tidymodels function that I could use instead of vip::vi(model_fit)?

Thanks


Solution

  • It looks like you are passing the wrong object: lm_aq_model is only the model specification, not the fitted model. Why not give it the fit (lm_fit)? That's what the RF example does.

    vip::vi(lm_fit, method = "permute", target = "Ozone",
            metric = "rsquared", pred_wrapper = predict)
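
    On the second part of the question, here is a minimal sketch, assuming reasonably recent versions of tidymodels and vip are installed. With its default model-based method, vip::vi() accepts a parsnip fit directly; for an lm engine the score is the absolute value of each coefficient's t-statistic. extract_fit_engine() pulls out the underlying lm object if you would rather work with that. The names lm_engine, vi_from_fit and vi_from_engine are just for illustration.

```r
library(tidymodels)

aq <- na.omit(airquality)

# Fit the linear model through parsnip
lm_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(Ozone ~ ., data = aq)

# Model-based importance taken straight from the parsnip fit
# (for lm this is |t-statistic| of each coefficient)
vi_from_fit <- vip::vi(lm_fit)
vi_from_fit

# Same idea on the underlying lm object
lm_engine <- extract_fit_engine(lm_fit)
vi_from_engine <- vip::vi(lm_engine)
vi_from_engine
```

    If you would rather not depend on vip for this, tidy(lm_fit) reports the same t-statistics in its statistic column, which may be the closest thing to a generic tidymodels route for a linear model.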