
Add custom lines to geom_violin in ggplot / ggplotly


I have thresholds for each of the groups of my data, and when plotting a violin plot I would like these to be shown within the plots.

Instead, the lines are not constrained to their corresponding groups: they span either the full width of the plot (ggplot) or only a small section of it (ggplotly).

I think stat_sina from the ggforce package is the way forward, but I am having no luck.

My code is:

library(tidyverse)
library(plotly)

customLines <- data.frame(name = c('Petal.Length', 'Petal.Width', 'Sepal.Length', 'Sepal.Width'),
                          value = c(2, 1, 5, 3))

iris %>%
  pivot_longer(-Species) %>%
  ggplot(aes(x = name,
             y = value,
             fill = name,
             colour = name,
             group = name)) +
  geom_violin(alpha = 0.3) +
  ggforce::geom_sina() +
  ggforce::stat_sina(data = customLines,
                     aes(x = name,
                         yintercept = value),
                     geom = "hline")
ggplotly()

This outputs:

[Image: the resulting plot]

As is clear to see, the lines do not correspond with the violin plots.

How can I make each line span only the width of its own violin and stay inside it? It might be similar to the quantile approach, but I have no clue how to use that in this format either.


Solution

  • My solution uses ggplot_build to find the maximum width of each violin, then uses that information in geom_segment. ggplotly also works with this solution (figure not shown).

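    library(tidyverse)
    customLines <- data.frame(name = c('Petal.Length', 'Petal.Width', 'Sepal.Length', 'Sepal.Width'),
                              value = c(2, 1, 5, 3))
    
    # Numeric x position of each group on the discrete axis (1, 2, 3, 4)
    customLines <- customLines %>% 
      mutate(fac = as.integer(factor(name, levels = name)))
    
    p1 <- iris %>%
      pivot_longer(-Species) %>%
      ggplot(aes(x = name,
                 y = value,
                 fill = name,
                 colour = name,
                 group = name)) +
      geom_violin(alpha = 0.3)
    
    # violinwidth is a 0-1 scaling factor applied to the violin's bounding box
    # (xmin to xmax), so the drawn width is violinwidth * (xmax - xmin)
    violin_width <- ggplot_build(p1)$data[[1]] %>% 
      # Convert position columns to plain numbers so the join key types match
      mutate(across(c(x, xmin, xmax), as.numeric)) %>% 
      group_by(x) %>% 
      summarize(width = max(violinwidth * (xmax - xmin))) %>% 
      left_join(customLines, by = c("x" = "fac"))
    
    # One horizontal segment per group, centred on the group's x position
    p1 + 
      geom_segment(data = violin_width, 
                   aes(x = x - width/2, y = value, xend = x + width/2, yend = value))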
    

    Created on 2023-05-12 with reprex v2.0.2
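
The segments above use the widest point of each violin, so a line can stick out where the violin is narrower at the threshold. If the line should end exactly at the violin outline at that height, one possible variation is to interpolate the width at the threshold value. This is only a sketch under some assumptions: it assumes the built layer data contains the columns x, xmax, y and violinwidth, that character values are placed on the x axis in sorted order, and that every threshold lies within the range of its violin. The helper half_width_at and the names segments and p2 are made up for illustration.

```r
library(tidyverse)

customLines <- data.frame(
  name  = c("Petal.Length", "Petal.Width", "Sepal.Length", "Sepal.Width"),
  value = c(2, 1, 5, 3)
)

p1 <- iris %>%
  pivot_longer(-Species) %>%
  ggplot(aes(x = name, y = value, fill = name, colour = name)) +
  geom_violin(alpha = 0.3)

# Outline data computed by ggplot2 for the violin layer.
# Assumption: it contains the columns x, xmax, y and violinwidth.
violin_data <- ggplot_build(p1)$data[[1]]

# Hypothetical helper: half-width of the violin at x position `xpos`
# and height `yval`.
half_width_at <- function(xpos, yval) {
  d <- violin_data[as.numeric(violin_data$x) == xpos, ]
  half <- d$violinwidth * (as.numeric(d$xmax) - as.numeric(d$x))
  # Linear interpolation between the density grid points;
  # returns NA if yval lies outside the violin's range
  approx(d$y, half, xout = yval)$y
}

segments <- customLines %>%
  mutate(
    # Assumption: character values sit on the discrete axis in sorted order
    x    = match(name, sort(unique(name))),
    half = map2_dbl(x, value, half_width_at)
  )

p2 <- p1 +
  geom_segment(
    data = segments,
    aes(x = x - half, xend = x + half, y = value, yend = value),
    inherit.aes = FALSE
  )
p2
```

Setting inherit.aes = FALSE draws the segments in the default colour so they stand out against the violin outlines; drop it if the lines should take the group colours instead.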