Tags: r, ggplot2, boxplot, errorbar, violin-plot

plot horizontal cutoff lines on top of boxplots


I have a combined violin and boxplot figure, built here with the iris data as an example:

data(iris)
iris_long <- as.data.frame(tidyr::pivot_longer(iris, -Species, names_to = "variable", values_to = "value"))
P <- ggplot2::ggplot(iris_long, ggplot2::aes(x=variable, y=value)) +
  ggplot2::geom_violin() +
  ggplot2::geom_boxplot(width=0.2, alpha=0, outlier.shape=NA, coef=0, show.legend=FALSE) +
  ggplot2::stat_summary(fun=mean, geom="point", shape=16, size=2, show.legend=FALSE) +
  ggplot2::theme_light()
grDevices::pdf(file="test.pdf", height=4, width=8)
print(P)
grDevices::dev.off()

Which produces this plot:

[Figure 1: violin plots with overlaid boxplots and mean points, one per iris measurement]

Now I also have a data frame with cutoff values per variable level, like this one:

cutoffs <- data.frame(variable=sort(names(iris)[-length(names(iris))]), cutoff=c(7,2.5,8,4.5))

I just want to plot them as red cutoff bars on top of each violin+boxplot, so the result looks similar to this:

[Figure 2: the same plot with a short red horizontal cutoff line above each violin]

What would be the best way to go about this: geom_segment, or is there an easier way, maybe with geom_errorbar?

EDIT

Now I am trying to accomplish the same, but including the Species information in the violin+boxplots plot. Once I add fill=Species (with the data formatted appropriately), I get an error with geom_segment that I do not know how to debug.

Besides, even without geom_segment the violins and the boxplots do not align well.

How can I debug the error below and align the violins and boxplots correctly?

Note that for the segments I still want just one cutoff per variable level (one for Sepal.Length, one for Sepal.Width, one for Petal.Length, and one for Petal.Width).

This is the new code:

data(iris)
iris_long <- as.data.frame(tidyr::pivot_longer(iris, cols=names(iris)[-length(names(iris))],
                                               names_to = "variable", values_to = "value"))
cutoffs <- data.frame(variable=sort(names(iris)[-length(names(iris))]), cutoff=c(7,2.5,8,4.5))

P <- ggplot2::ggplot(iris_long, ggplot2::aes(x=variable, y=value, fill=Species)) +
  ggplot2::geom_violin() +
  ggplot2::geom_boxplot(width=0.2, alpha=0, outlier.shape=NA, coef=0, show.legend=FALSE) +
  ggplot2::stat_summary(fun=mean, geom="point", shape=16, size=2, show.legend=FALSE) +
  ggplot2::geom_segment(data=cutoffs, ggplot2::aes(x=as.numeric(as.factor(variable)) - .25,
                                                   xend=as.numeric(as.factor(variable)) + .25,
                                                   y=cutoff, yend=cutoff), color = "red", alpha=0.75) +
  ggplot2::theme_light()
grDevices::pdf(file="test.pdf", height=4, width=8)
print(P)
grDevices::dev.off()

And the error:

Error in ggplot2::geom_segment():
! Problem while computing aesthetics.
ℹ Error occurred in the 4th layer.
Caused by error:
! object 'Species' not found
Run rlang::last_trace() to see where the error occurred.


Solution

  • Yes, I would simply add a geom_segment:

    library(ggplot2)

    ggplot(iris_long, aes(x = variable, y = value)) +
      geom_violin() +
      geom_boxplot(width = 0.2, alpha = 0, outlier.shape = NA, coef = 0, show.legend = FALSE) +
      stat_summary(fun = mean, geom = "point", shape = 16, size = 2, show.legend = FALSE) +
      # Discrete x positions are 1, 2, 3, ...; each segment spans +/- 0.25 around its position
      geom_segment(data = cutoffs,
                   aes(x = as.numeric(as.factor(variable)) - .25,
                       xend = as.numeric(as.factor(variable)) + .25,
                       y = cutoff, yend = cutoff),  # the column is named cutoff, not cutoffs
                   color = "red") +
      theme_light()
    

    which produces this plot:

    [Image: a boxplot amended with a violin plot and some red lines on top of it]

    N.B. I suspect your cutoff values are paired with the wrong variables (hence the lines do not line up with the violins).
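
One way to check that guess, as a rough sketch: compare each cutoff with the largest observed value of the variable it is paired with. The names `measure_cols`, `maxima`, `check` and `above_max` are illustrative and not part of the original code; only base R is used.

```r
data(iris)

# Cutoffs as defined in the question: variable names sorted alphabetically
measure_cols <- names(iris)[-length(names(iris))]
cutoffs <- data.frame(variable = sort(measure_cols), cutoff = c(7, 2.5, 8, 4.5))

# Largest observed value for each measurement column
maxima <- data.frame(variable = measure_cols,
                     max_value = vapply(iris[measure_cols], max, numeric(1)))

# Join by variable name so that row order does not matter
check <- merge(cutoffs, maxima, by = "variable")

# TRUE means the cutoff line sits at or above the top of that variable's violin
check$above_max <- check$cutoff >= check$max_value
print(check)
```

If `above_max` is FALSE for some row, the cutoff for that variable falls inside the data range, which would suggest the values and the variable names are out of step.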


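The question also asks about geom_errorbar. A minimal sketch of that alternative, assuming the same `iris_long` and `cutoffs` as in the question: setting `ymin` and `ymax` to the same value collapses the error bar to a single horizontal line, and the discrete `variable` can be mapped to `x` directly, so no `as.numeric(as.factor())` conversion is needed. The object name `p_errorbar` and the `width = 0.5` value are illustrative choices.

```r
library(ggplot2)

data(iris)
iris_long <- as.data.frame(tidyr::pivot_longer(iris, -Species,
                                               names_to = "variable", values_to = "value"))
cutoffs <- data.frame(variable = sort(names(iris)[-length(names(iris))]),
                      cutoff = c(7, 2.5, 8, 4.5))

p_errorbar <- ggplot(iris_long, aes(x = variable, y = value)) +
  geom_violin() +
  geom_boxplot(width = 0.2, alpha = 0, outlier.shape = NA, coef = 0, show.legend = FALSE) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 2, show.legend = FALSE) +
  # ymin == ymax collapses the error bar to one horizontal line of the given width;
  # inherit.aes = FALSE keeps the global y = value mapping out of this layer,
  # because cutoffs has no value column
  geom_errorbar(data = cutoffs,
                aes(x = variable, ymin = cutoff, ymax = cutoff),
                width = 0.5, color = "red", inherit.aes = FALSE) +
  theme_light()

print(p_errorbar)
```

Whether this is easier than geom_segment is a matter of taste; the main difference is that the bar is positioned by the variable name rather than by a computed numeric position.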
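    To incorporate the fill aesthetic, you need to pass an explicit position_dodge to geom_violin, geom_boxplot and stat_summary so that all three layers are dodged by the same width. Furthermore, set the fill aesthetic of the segment layer to NULL: otherwise the layer inherits fill = Species from the ggplot() call and looks for Species in cutoffs, which is what triggers the error: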

    ggplot(iris_long, aes(x = variable, y = value, fill = Species)) +
      geom_violin(position = position_dodge(width = 1)) +
      geom_boxplot(width = 0.2, alpha = 0, outlier.shape = NA, coef = 0, show.legend = FALSE,
                   position = position_dodge(width = 1)) +
      stat_summary(fun = mean, geom = "point", shape = 16, size = 2, show.legend = FALSE,
                   position = position_dodge(width = 1)) +
      geom_segment(data = cutoffs, 
                   aes(x = as.numeric(as.factor(variable)) - .25, 
                       xend = as.numeric(as.factor(variable)) + .25, 
                       y = cutoff, yend = cutoff, fill = NULL), 
                   color = "red") +
      theme_light()
    

    produces:

    [Image: a boxplot amended with a violin plot colored by species and some red lines on top of it]
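
As a variation on `fill = NULL`, the segment layer can opt out of all inherited aesthetics with `inherit.aes = FALSE`, so it only uses the mappings listed in its own `aes()`. This is a sketch under the same data as above; the names `dodge` and `p_fill` are illustrative, and sharing one `position_dodge` object across layers is just a convenience.

```r
library(ggplot2)

data(iris)
iris_long <- as.data.frame(tidyr::pivot_longer(iris, -Species,
                                               names_to = "variable", values_to = "value"))
cutoffs <- data.frame(variable = sort(names(iris)[-length(names(iris))]),
                      cutoff = c(7, 2.5, 8, 4.5))

# One dodge specification shared by all grouped layers keeps them aligned
dodge <- position_dodge(width = 1)

p_fill <- ggplot(iris_long, aes(x = variable, y = value, fill = Species)) +
  geom_violin(position = dodge) +
  geom_boxplot(width = 0.2, alpha = 0, outlier.shape = NA, coef = 0, show.legend = FALSE,
               position = dodge) +
  stat_summary(fun = mean, geom = "point", shape = 16, size = 2, show.legend = FALSE,
               position = dodge) +
  # inherit.aes = FALSE: this layer ignores the global x, y and fill mappings,
  # so it never looks for Species in the cutoffs data frame
  geom_segment(data = cutoffs,
               aes(x = as.numeric(as.factor(variable)) - .25,
                   xend = as.numeric(as.factor(variable)) + .25,
                   y = cutoff, yend = cutoff),
               color = "red", inherit.aes = FALSE) +
  theme_light()

print(p_fill)
```

The half-width of .25 is taken from the answer above; with three dodged violins per variable, a larger value may be preferable if the line should span the whole group.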