I am plotting box plots of the accuracy scores of resamples of yearly models trained with caret.
The models are named by the years they refer to: 2000, 2001, 2002, ..., 2010.
I want the models to appear in the box plots in ascending order of year, i.e. by model name.
The summary of the resamples produced by the code below
fit.year.res <- resamples(fit.year)
summary(fit.year.res)
looks like this:
However, the yearly models in the box plot are not sorted:
scales <- list(x=list(relation="free"), y=list(relation="free"))
bwplot(fit.year.res, scales=scales)
I have tried converting the models element of the resamples object, fit.year.res$models, from character to factor, but it didn't make any difference.
I am not aware of an easy solution using the bwplot method from the caret package. Perhaps there is one, but my lattice skills are lacking. I recommend plotting the box plots manually using ggplot2. This way you will have much better control over the final plot.
Since you did not post an example with data, I will use one of the examples from ?caret:::bwplot.resamples:
library(caret)
library(party)
library(RWeka)
load(url("http://topepo.github.io/caret/exampleModels.RData"))
resamps <- resamples(list(CART = rpartFit,
                          CondInfTree = ctreeFit,
                          MARS = earthFit))
bwplot(resamps,
       metric = "RMSE")
produces:
To make the plot manually using ggplot you will need some data manipulation:
library(tidyverse)
resamps$values %>% # extract the values
  select(1, ends_with("RMSE")) %>% # select the first column and all columns with a name ending in "RMSE"
  gather(model, RMSE, -1) %>% # convert to a long table
  mutate(model = sub("~RMSE", "", model)) %>% # leave just the model names
  ggplot() + # call ggplot
  geom_boxplot(aes(x = RMSE, y = model)) -> p1 # and plot the box plot
p1
To set a specific order on the y axis:
p1 +
scale_y_discrete(limits = c("MARS", "CART", "CondInfTree"))
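For year-named models like those in the question, the same idea might look like the sketch below. The data frame is synthetic and only mimics the shape of fit.year.res$values (a Resample column plus one "<model>~Accuracy" column per model); the names values, long, year_order and p_years are made up for illustration. It uses pivot_longer() rather than gather(), since tidyr now documents gather() as superseded.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Synthetic stand-in for fit.year.res$values (NOT the asker's data):
# one "Resample" column plus one "<model>~Accuracy" column per yearly model.
set.seed(1)
years <- c("2010", "2003", "2000", "2007")  # deliberately unsorted
values <- data.frame(Resample = paste0("Fold", 1:10))
for (yr in years) {
  values[[paste0(yr, "~Accuracy")]] <- runif(10, min = 0.7, max = 0.9)
}

# Reshape to long format and strip the metric suffix from the model names
long <- values %>%
  pivot_longer(cols = ends_with("~Accuracy"),
               names_to = "model", values_to = "Accuracy") %>%
  mutate(model = sub("~Accuracy", "", model, fixed = TRUE))

# Sort the model names numerically so the years ascend along the axis
year_order <- as.character(sort(as.numeric(unique(long$model))))

p_years <- ggplot(long) +
  geom_boxplot(aes(x = Accuracy, y = model)) +
  scale_y_discrete(limits = year_order)
p_years
```

The first entry of limits is drawn at the bottom of the y axis, so wrap year_order in rev() if you would rather read the years from top to bottom.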
If you prefer lattice:
library(lattice)
resamps$values %>%
  select(1, ends_with("RMSE")) %>%
  gather(model, RMSE, -1) %>%
  mutate(model = sub("~RMSE", "", model)) %>%
  {bwplot(model ~ RMSE, data = .)}
To change the order, change the levels of model (this approach also works with ggplot2):
resamps$values %>%
  select(1, ends_with("RMSE")) %>%
  gather(model, RMSE, -1) %>%
  mutate(model = sub("~RMSE", "", model),
         model = factor(model, levels = c("MARS", "CART", "CondInfTree"))) %>%
  {bwplot(model ~ RMSE, data = .)}
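Applied to the yearly models from the question, the factor-level approach could be sketched as follows, this time with base R and lattice only. The data are synthetic and merely imitate the layout of fit.year.res$values; values, long and p_lat are illustrative names, not objects from the question.

```r
library(lattice)

# Synthetic stand-in for fit.year.res$values (NOT the asker's data)
set.seed(1)
years <- c("2010", "2003", "2000", "2007")  # deliberately unsorted
values <- data.frame(Resample = paste0("Fold", 1:10))
for (yr in years) {
  values[[paste0(yr, "~Accuracy")]] <- runif(10, min = 0.7, max = 0.9)
}

# Keep only the Accuracy columns and stack them into long format
acc_cols <- grep("~Accuracy$", names(values), value = TRUE)
long <- stack(values[acc_cols])  # returns columns "values" and "ind"
names(long) <- c("Accuracy", "model")
long$model <- sub("~Accuracy", "", as.character(long$model), fixed = TRUE)

# Explicit factor levels in ascending numeric order of the year
year_levels <- as.character(sort(as.numeric(unique(long$model))))
long$model <- factor(long$model, levels = year_levels)

p_lat <- bwplot(model ~ Accuracy, data = long)
print(p_lat)
```

lattice draws the first factor level at the bottom of the axis, so pass rev(year_levels) as the levels if the earliest year should appear at the top.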