
How to use facet_wrap or facet_grid to pass random forest autoplot to ggplotly in R


I'm currently running into a bit of trouble trying to tweak the visualizations of my hyperparameter tuning in R.

I've built a few tidymodels workflows based on stock market data (obtained via the quantmod package). To avoid getting side-tracked, assume the data frame is called NVDA and that we want to predict the variable Close. After splitting the data frame into training and testing sets and constructing the necessary workflows (elastic_net_wflow_NVDA and random_forest_wflow_NVDA), I perform a grid search:

library(tidymodels)  # attaches rsample, dials, tune, workflows, ...

# 10-fold cross-validation, stratified on the outcome
NVDA_folds <- vfold_cv(NVDA_train, v = 10, strata = Close)

elastic_net_grid <- grid_regular(penalty(range = c(0, 1500), trans = identity_trans()),
                                 mixture(),
                                 levels = 21)

random_forest_grid <- grid_regular(mtry(range = c(1, 14)),
                                   trees(range = c(200, 700)),
                                   min_n(range = c(1, 14)),
                                   levels = 5)

elastic_net_tune_NVDA  <- tune_grid(
  elastic_net_wflow_NVDA ,
  resamples = NVDA_folds,
  grid = elastic_net_grid
)
random_forest_tune_NVDA <- tune_grid( 
  random_forest_wflow_NVDA,
  resamples = NVDA_folds,
  grid = random_forest_grid
)

Once the grid search completes, I am able to successfully pass the autoplot of my results to the ggplotly() function:

library(plotly)  # provides ggplotly()

g1 <- autoplot(elastic_net_tune_NVDA) + ggtitle("Elastic Net NVDA") + theme_dark()
ggplotly(g1)

which conveniently allows me to interact with the data points on the graph.

[Image: interactive ggplotly() rendering of the elastic net tuning results]

However, when I try to do the same with the random forest grid search

g1 <- autoplot(random_forest_tune_NVDA) + ggtitle("Random Forest NVDA") + theme_dark()
ggplotly(g1)

I get the error

Error in if (attr(labels, "facet") == "wrap") { : 
  argument is of length zero

After some searching, it seems the issue lies with how the facets are assigned to the plot. Presumably I need to set them explicitly, through either facet_grid() or facet_wrap(), so that I get a panel structure similar to the usual output of calling autoplot on its own:

[Image: default autoplot() output for the random forest tuning results]

The main trouble is that the variables from the grid search (i.e., mtry, min_n, and trees) don't seem to be accessible. Is there any way to facet this plot correctly so that I can pass it to ggplotly()? Any help is much appreciated!


Edit: The following R script (linked in Pastebin) should provide a minimal reproduction of the error: https://pastebin.com/0tENiGXg


Solution

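  • Create the autoplot, extract the data it was built from (g1$data), and re-create the plot from that data with an explicit facet_grid(). The tuning parameters are available there as columns (for example `# Trees` and `Minimal Node Size`), and the rebuilt plot can be passed to ggplotly().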

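    library(ggplot2)
    library(plotly)

    # Build the default autoplot only to get at its underlying data
    g1 <- autoplot(random_forest_tune_NVDA)

    # Rebuild the plot by hand with an explicit facet_grid()
    pp <- ggplot(g1$data, aes(x = value, y = mean, group = `# Trees`, color = `# Trees`)) +
      geom_path() +
      geom_point(size = 1) +
      facet_grid(`.metric` ~ `Minimal Node Size`, scales = "free_y", labeller = label_both) +
      labs(x = paste(unique(g1$data$name)),
           y = "")
    pp

    # The rebuilt plot should now convert to an interactive widget
    ggplotly(pp)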
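
For anyone who wants to try the rebuild-and-facet approach without running the full tuning job, here is a minimal sketch on synthetic data. The data frame demo_data is a hypothetical stand-in for g1$data: its column names are copied from the code above, but the values, the metric names, and the axis label stored in name are made up for illustration and are not real tuning results. The one deliberate difference is that `# Trees` is wrapped in factor() to get a discrete colour scale.

```r
library(ggplot2)
library(plotly)

# Hypothetical stand-in for g1$data: 5 x 3 x 3 x 2 = 90 rows of made-up values.
demo_data <- expand.grid(
  value  = c(1, 4, 7, 10, 14),    # plays the role of the mtry values on the x-axis
  trees  = c(200, 450, 700),
  min_n  = c(1, 7, 14),
  metric = c("rmse", "rsq"),
  stringsAsFactors = FALSE
)

# Made-up "performance" values so that the lines have some shape.
demo_data$mean <- 1 / (1 + demo_data$value) + demo_data$trees / 10000

# Rename to the column names used in the answer above.
names(demo_data)[names(demo_data) == "trees"]  <- "# Trees"
names(demo_data)[names(demo_data) == "min_n"]  <- "Minimal Node Size"
names(demo_data)[names(demo_data) == "metric"] <- ".metric"

# Assumed x-axis label; on real results this comes from g1$data$name.
demo_data$name <- "# Randomly Selected Predictors"

# Same construction as in the answer, with a discrete colour for the tree counts.
pp <- ggplot(demo_data,
             aes(x = value, y = mean,
                 group = `# Trees`, color = factor(`# Trees`))) +
  geom_path() +
  geom_point(size = 1) +
  facet_grid(`.metric` ~ `Minimal Node Size`,
             scales = "free_y", labeller = label_both) +
  labs(x = unique(demo_data$name), y = "", color = "# Trees")

# Convert the hand-built plot to an interactive plotly widget.
p_interactive <- ggplotly(pp)
p_interactive
```

On the real tuning object, names(g1$data) shows which columns are available for faceting, which is useful if the grid contains different parameters from the ones used here.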