r ggplot2 histogram jitter

How to overlay means and error bars with jitter dots and smooth distribution with ggplot2 in R?


To get a complete picture of a dataset, one solution is to show the means with error bars around them, the individual scores as jittered points, and a smoothed distribution of those scores. An example of such a figure is taken from Yang, B. W., et al. (2021).

How can we combine the points for the means, the error bars, the jittered dots and a smoothed histogram in the same plot, with a small gutter between each?

For the purpose of illustration, let's suppose that the data are

x1=c(2.0,2.1,2.5,2.7,2.8,3.1)
x2=c(2.5,2.9,3.0,3.2,3.3,3.9)
x=data.frame(cbind(x1,x2))

and that the statistics used to draw the points and the error bars are

group = c(1, 2)
centr = c(2.53, 3.13) 
width = c(0.50, 0.50) 
stats = data.frame( cbind(group, centr, centr-width, centr+width ) )
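
As a side note, the values in centr agree with the column means of x rounded to two decimals, and the two unnamed columns passed to cbind end up named V3 and V4, which is why the plotting code refers to those names. A minimal sketch of the same statistics with explicit column names follows; the names lower, upper and stats_named are illustrative assumptions, and the half-width of 0.50 is simply taken as given rather than derived from the data.

```r
# Data from the question
x1 <- c(2.0, 2.1, 2.5, 2.7, 2.8, 3.1)
x2 <- c(2.5, 2.9, 3.0, 3.2, 3.3, 3.9)
x <- data.frame(x1, x2)

# Column means, rounded to two decimals; these match centr = c(2.53, 3.13)
centr <- round(unname(colMeans(x)), 2)

# Half-width of the error bars, taken as given (not derived from the data here)
width <- c(0.50, 0.50)

# Same statistics with explicit names; `lower` and `upper` are illustrative names
stats_named <- data.frame(
  group = c(1, 2),
  centr = centr,
  lower = centr - width,
  upper = centr + width
)

# The original construction leaves the last two columns unnamed,
# so data.frame() fills in the names V3 and V4
stats <- data.frame(cbind(group = c(1, 2), centr, centr - width, centr + width))
names(stats)
```

If the named version were used instead, the error bar layer would need ymin=lower and ymax=upper in place of V3 and V4.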

I managed to make the plot with the points and error bars with

ggplot( stats ) +
    geom_point( aes(x=group, y=centr), size = 1 ) +
    geom_errorbar(stat="identity", position=position_dodge(.9), aes( x=group, ymin=V3, ymax=V4), width=0.1 ) +
    scale_y_continuous("mean ratings") 

and the jitter dots with

ggplot( x ) +
    geom_jitter( aes( y= x1, x = 1, col=1), width=0.15 ) + 
    geom_jitter( aes( y= x2, x = 2, col=2), width=0.15 )

but I do not know how to add the smoothed distributions.

Further, if I wish the two groups of data to be separated (the first group's point, error bar, jitter dots and histogram on the left, say, and the second group's point, error bar, jitter dots and histogram to the right), what changes would be required?


Solution

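  • You could achieve your desired result like so:

    1. Convert your x dataset to long format with tidyr::pivot_longer, so that all scores sit in one value column next to a group column.
    2. To add the densities, map the scores to x instead of y (by default geom_density estimates the density along x) and use coord_flip to put the scores back on the vertical axis.
    3. To position the error bars and jitter points, give them fixed negative y values (y = -2 for the means and error bars, y = -1 for the jitter points) so that they sit beside the density, which starts at y = 0.
    4. To separate the groups, facet by group, then set panel.spacing to zero and remove the strip text so that the panels read as a single plot.
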
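    # Data from the question: one column of scores per group
    x1 <- c(2.0, 2.1, 2.5, 2.7, 2.8, 3.1)
    x2 <- c(2.5, 2.9, 3.0, 3.2, 3.3, 3.9)
    x <- data.frame(cbind(x1, x2))
    
    # Long format: one row per score; names_prefix strips the "x" so group is "1" or "2"
    x_long <- tidyr::pivot_longer(x, everything(), names_prefix = "x", names_to = "group")
    x_long$group <- as.integer(x_long$group)
    
    # Summary statistics; the two unnamed columns are auto-named V3 and V4
    group <- c(1, 2)
    centr <- c(2.53, 3.13)
    width <- c(0.50, 0.50)
    stats <- data.frame(cbind(group, centr, centr - width, centr + width))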
    
    library(ggplot2)
    
    ggplot(stats, aes(color = factor(group))) +
      geom_point(aes(y = -2, x = centr), size = 1) +
      geom_errorbar(stat = "identity", aes(y = -2, xmin = V3, xmax = V4), width = 0.1) +
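      # Jitter only along y (the group direction after coord_flip) so the scores themselves are not shifted
      geom_jitter(data = x_long, aes(x = value, y = -1), width = 0, height = 0.4) +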
      geom_density(data = x_long, aes(x = value, fill = factor(group), group = group), alpha = .7) +
      scale_x_continuous("mean ratings") +
      scale_y_continuous(expand = c(0, .2)) +
      coord_flip() +
      facet_wrap(~group) +
      theme(axis.text.x = element_blank(), axis.title.x = element_blank(), axis.ticks.x = element_blank(),
            panel.spacing.x = unit(0, "pt"),
            strip.text.x = element_blank()) +
      labs(color = NULL, fill = NULL)
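
To see why negative y positions keep the error bars and jitter points clear of the smoothed distributions, it can help to look at the density estimates directly. The sketch below uses base R's density() with its defaults (Gaussian kernel, bandwidth rule "nrd0"), which, as far as I know, is also what geom_density uses by default; the names d1, d2 and peak are illustrative and not part of the plot code.

```r
# Data from the question
x1 <- c(2.0, 2.1, 2.5, 2.7, 2.8, 3.1)
x2 <- c(2.5, 2.9, 3.0, 3.2, 3.3, 3.9)

# Kernel density estimates with the base-R defaults
d1 <- density(x1)
d2 <- density(x2)

# Density heights are never negative, so anything drawn at y = -1 or y = -2
# lies on the other side of zero from the curves
range(d1$y)
range(d2$y)

# Peak heights, useful when choosing how far to offset the dots and error bars
peak <- c(max(d1$y), max(d2$y))
peak
```

If the peaks turn out much larger or smaller than 1 for your own data, you may want to scale the -1 and -2 offsets accordingly so that the gutters stay visually balanced.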