
Mysterious error with tidymodels recipe using "step_interact"


I am trying to create a recipe to train a model on the ames data set, but fitting the model fails with an error I don't understand. Here is a minimal working example (MWE):

library(tidyverse)
library(tidymodels)


# Load dataset
data("ames")

ames <- ames |> 
    mutate(Sale_Price = log10(Sale_Price))

# Split the data frame into train/test
set.seed(123)
ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)
ames_test <- testing(ames_split)

# Model Specification
model_spec <- linear_reg() |> 
    set_engine("lm")

# Recipe
ames_rec <- recipe(Sale_Price ~ ., data = ames_train) |> 
    step_log(Gr_Liv_Area, base = 10) |> 
    step_dummy(all_nominal_predictors()) |> 
    step_interact( ~ Gr_Liv_Area : starts_with("Bldg_Type")) |> 
    step_zv(all_numeric_predictors()) |> 
    step_normalize(all_numeric_predictors()) |> 
    step_pca(matches("(SF$)|(^Bsmt)|(^Garage)"), num_comp = 5) |> 
    prep()

# Workflow
ames_wflow <- workflow() |> 
    add_model(model_spec) |> 
    add_recipe(ames_rec)

# Train the model
model_fit <-  fit(ames_wflow, ames_train)

When I run this code it gives me the following error:

Error in `step_interact()`:
Caused by error in `str2lang()`:
! <text>:2:0: unexpected end of input
1: ~
   ^
Run `rlang::last_trace()` to see where the error occurred.

Can you explain what I am doing wrong?


Solution

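  • Without the final prep(), it works. A workflow expects an untrained recipe: add_recipe() takes the recipe specification, and fit() on the workflow runs prep() internally. Prepping the recipe beforehand is unnecessary and is what triggers the error here: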

    library(tidymodels)
    
    # Load dataset
    data("ames")
    
    ames <- ames |> 
      mutate(Sale_Price = log10(Sale_Price))
    
    # Split the data frame into train/test
    set.seed(123)
    ames_split <- initial_split(ames, prop = 0.80)
    ames_train <- training(ames_split)
    ames_test <- testing(ames_split)
    
    # Model Specification
    model_spec <- linear_reg() |> 
      set_engine("lm")
    
    # Recipe
    ames_rec <- recipe(Sale_Price ~ ., data = ames_train) |> 
      step_log(Gr_Liv_Area, base = 10) |> 
      step_dummy(all_nominal_predictors()) |> 
      step_interact( ~ Gr_Liv_Area : starts_with("Bldg_Type")) |> 
      step_zv(all_numeric_predictors()) |> 
      step_normalize(all_numeric_predictors()) |> 
      step_pca(matches("(SF$)|(^Bsmt)|(^Garage)"), num_comp = 5)
    
    # Workflow
    ames_wflow <- workflow() |> 
      add_model(model_spec) |> 
      add_recipe(ames_rec)
    
    # Train the model
    model_fit <-  fit(ames_wflow, ames_train)
    class(model_fit)
    #> [1] "workflow"
    

    Created on 2023-11-28 with reprex v2.0.2
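
If the reason for calling prep() was to look at the processed data, one option is to keep the untrained recipe for the workflow and prep a separate copy only for inspection. The following is a minimal sketch of that idea, not part of the original answer. It assumes a reduced formula (only Gr_Liv_Area and Bldg_Type) to keep the example short, and the names ames_prepped and baked are made up for illustration.

```r
library(tidymodels)

# Load dataset and log-transform the outcome, as in the question
data("ames")
ames <- ames |>
  mutate(Sale_Price = log10(Sale_Price))

set.seed(123)
ames_split <- initial_split(ames, prop = 0.80)
ames_train <- training(ames_split)

# Untrained recipe: this is the object the workflow should receive.
# Assumption: a reduced formula, used here only for brevity.
ames_rec <- recipe(Sale_Price ~ Gr_Liv_Area + Bldg_Type, data = ames_train) |>
  step_log(Gr_Liv_Area, base = 10) |>
  step_dummy(all_nominal_predictors()) |>
  step_interact(~ Gr_Liv_Area:starts_with("Bldg_Type"))

# Prep a separate copy purely for inspection; ames_rec itself stays untrained
ames_prepped <- prep(ames_rec)
baked <- bake(ames_prepped, new_data = NULL)
names(baked)

# The workflow gets the untrained recipe and preps it during fit()
ames_wflow <- workflow() |>
  add_model(linear_reg() |> set_engine("lm")) |>
  add_recipe(ames_rec)

model_fit <- fit(ames_wflow, ames_train)
class(model_fit)
```

Here bake(new_data = NULL) returns the processed training data, so the dummy and interaction columns can be checked before fitting, while the workflow still receives the recipe specification rather than a trained recipe.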