
sort `caret` models in `bwplot()`


I am plotting box plots of the accuracy scores from resamples of yearly models trained with caret. The models are named after the years they refer to: 2000, 2001, 2002, ..., 2010. I want the models to appear in the box plots in ascending order of year, i.e. by model name.

The summary of the resamples, produced by the code below,

fit.year.res <- resamples(fit.year)
summary(fit.year.res)

looks like this:

[image: output of summary(fit.year.res)]

But then, the different yearly models in the box plot are not sorted:

scales <- list(x=list(relation="free"), y=list(relation="free"))
bwplot(fit.year.res, scales=scales)

[image: box plots from bwplot(fit.year.res), with the yearly models unsorted]

I have tried converting the models element of the resamples object, fit.year.res$models, from character to factor, but it didn't make any difference.


Solution

  • I am not aware of an easy solution using the bwplot method from the caret package. There may be one, but my lattice skills are lacking. I recommend drawing the box plots manually with ggplot2, which gives you much better control over the final plot.

    Since you did not post an example with data, I will use one of the examples from ?caret:::bwplot.resamples:

    library(caret)
    library(party)
    library(RWeka)
    
    load(url("http://topepo.github.io/caret/exampleModels.RData"))
    
    resamps <- resamples(list(CART = rpartFit,
                              CondInfTree = ctreeFit,
                              MARS = earthFit))
    
    bwplot(resamps,
           metric = "RMSE")
    

    produces:

    [image: bwplot of RMSE for the CART, CondInfTree and MARS models]

    To make the plot manually with ggplot2, you will need some data manipulation:

    library(tidyverse)
    resamps$values %>% #extract the values
      select(1, ends_with("RMSE")) %>% #select the first column and all columns with a name ending with "RMSE"
      gather(model, RMSE, -1) %>% #convert to long table
      mutate(model = sub("~RMSE", "", model)) %>% #leave just the model names
      ggplot()+ #call ggplot
      geom_boxplot(aes(x = RMSE, y = model)) -> p1 #and plot the box plot
    
    p1
    

    [image: the same box plot drawn with ggplot2]
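
    Note that gather() has been superseded in tidyr by pivot_longer(). Here is a minimal sketch of the same reshaping step with the newer function. It uses a small made-up data frame in place of resamps$values; the column names follow the "model~metric" pattern seen above, but the numbers are purely illustrative and are not results from the example models:

    ```r
    library(dplyr)
    library(tidyr)

    # Hypothetical stand-in for resamps$values: one "<model>~<metric>"
    # column per model and metric, plus the Resample identifier.
    vals <- tibble(
      Resample        = c("Fold1", "Fold2", "Fold3"),
      `CART~RMSE`     = c(4.1, 3.9, 4.3),
      `MARS~RMSE`     = c(3.2, 3.4, 3.1),
      `CART~Rsquared` = c(0.60, 0.62, 0.58)
    )

    long <- vals %>%
      select(1, ends_with("RMSE")) %>%                  # keep Resample and the RMSE columns
      pivot_longer(-1, names_to = "model",
                   values_to = "RMSE") %>%              # convert to long table
      mutate(model = sub("~RMSE", "", model))           # leave just the model names

    long
    ```

    The resulting long table has the same three columns as the gather() version, so the ggplot() call can follow unchanged.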

    To set a specific order on the y axis:

    p1 +
      scale_y_discrete(limits = c("MARS", "CART", "CondInfTree"))
    

    [image: ggplot2 box plot with the models in the specified order]
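
    For models named by year, as in the question, the limits do not have to be typed out by hand. This is a rough sketch with made-up accuracy values (the data frame acc and the names year_order and p_year are invented for illustration), assuming the model names are the years stored as character strings:

    ```r
    library(ggplot2)

    # Made-up accuracy values for models named by year (illustrative only),
    # deliberately listed out of order.
    set.seed(42)
    acc <- data.frame(
      model    = rep(c("2010", "2001", "2000"), each = 10),
      Accuracy = runif(30, min = 0.7, max = 0.9),
      stringsAsFactors = FALSE
    )

    # Sort the names numerically, then turn them back into strings.
    year_order <- as.character(sort(as.numeric(unique(acc$model))))

    p_year <- ggplot(acc) +
      geom_boxplot(aes(x = Accuracy, y = model)) +
      scale_y_discrete(limits = year_order)

    p_year
    ```

    If the axis comes out in the opposite direction to the one you want, wrap the limits in rev().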

    If you prefer lattice:

    library(lattice)
    
    resamps$values %>%
      select(1, ends_with("RMSE")) %>%
      gather(model, RMSE, -1) %>%
      mutate(model = sub("~RMSE", "", model)) %>%
      {bwplot(model ~ RMSE, data = .)}
    

    [image: lattice box plot of RMSE by model]

    To change the order, change the levels of model (this approach also works with ggplot2):

    resamps$values %>%
      select(1, ends_with("RMSE")) %>%
      gather(model, RMSE, -1) %>%
      mutate(model = sub("~RMSE", "", model),
             model = factor(model, levels = c("MARS", "CART", "CondInfTree"))) %>%
      {bwplot(model ~ RMSE, data = .)}
    

    [image: lattice box plot with the models in the specified order]
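
    Applied to the question, where the models are named by year, the levels can be derived by sorting the names numerically. The following is only a sketch: vals is a made-up stand-in for fit.year.res$values, and it assumes the metric columns are named like "2000~Accuracy", following the "model~metric" pattern seen above.

    ```r
    library(lattice)

    # Made-up stand-in for fit.year.res$values: a Resample column plus one
    # "<model>~Accuracy" column per yearly model, deliberately out of order.
    set.seed(1)
    vals <- data.frame(Resample = paste0("Fold", 1:5))
    for (y in c("2010", "2001", "2000", "2005")) {
      vals[[paste0(y, "~Accuracy")]] <- runif(5, min = 0.7, max = 0.9)
    }

    # Long form with base R: one row per (resample, model) pair.
    acc_cols <- grep("~Accuracy$", names(vals), value = TRUE)
    long <- data.frame(
      model    = rep(sub("~Accuracy$", "", acc_cols), each = nrow(vals)),
      Accuracy = unlist(vals[acc_cols], use.names = FALSE),
      stringsAsFactors = FALSE
    )

    # Sort the model names numerically and use that as the level order.
    year_levels <- as.character(sort(as.numeric(unique(long$model))))
    long$model  <- factor(long$model, levels = year_levels)

    levels(long$model)

    # print() is needed when the plot is drawn from a script.
    print(bwplot(model ~ Accuracy, data = long))
    ```

    Converting to numbers before sorting avoids relying on how strings happen to sort, which matters once the names stop being all the same length.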