r ggplot2 plot histogram density-plot

Combine ggplots from data frames with different lengths


I have two ggplots built from data frames of different lengths. For each one, I plotted the density of a single column separately, as shown below. I want to combine these two plots into one ggplot with two different y axes, one on the left and one on the right, one for each data frame. How can I do this?

library(ggplot2)
library(ggpubr)  # provides theme_pubclean()

a = ggplot(GG, aes(x = as.numeric(Kstat))) +
  theme_pubclean()

a + geom_density() +
  geom_vline(aes(xintercept = mean(as.numeric(Kstat))),
             linetype = "dashed", size = 0.6) +
  xlim(0, 1000) + ylim(0, 0.1)

b = ggplot(all, aes(x = as.numeric(Kstat))) +
  theme_pubclean()

b + geom_density() +
  geom_vline(aes(xintercept = mean(as.numeric(Kstat))),
             linetype = "dashed", size = 0.6) +
  xlim(0, 1000) + ylim(0, 0.5)

Solution

  • We don't have your data, so here's an example with a dataset included in ggplot2:

    library(ggplot2)
    # Column 7 of diamonds is price; the two subsets have different lengths
    df1 <- diamonds[1:10, "price"]
    df2 <- diamonds[100:2100, "price"]
    

    For this example, the data in df1 is much less varied and so the density spike is ~25x higher.

    ggplot() +
      geom_density(data = df1, aes(x = price)) +
      geom_vline(data = df1, aes(xintercept = mean(price)), 
                 linetype = "dashed", size = 0.6) +
      geom_density(data = df2, aes(x = price), color = "red") +
      geom_vline(data = df2, aes(xintercept = mean(price)), 
                 linetype = "dashed", color = "red", size = 0.6) 
    


    One way to deal with this would be to scale the df2 density up 25x and to create a secondary axis with the inverse adjustment. (This is how secondary axes work in ggplot2; you first scale the data into the primary axis, and then create a secondary axis as an annotation that helps the reader interpret it.)

    ggplot() +
      geom_density(data = df1, aes(x = price)) +
      geom_vline(data = df1, aes(xintercept = mean(price)), 
                 linetype = "dashed", size = 0.6) +
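      # Scale the df2 density up 25x so it is readable against the primary (left) axis
      geom_density(data = df2, aes(x = price, y = after_stat(density) * 25), color = "red") +
      geom_vline(data = df2, aes(xintercept = mean(price)), 
                 linetype = "dashed", color = "red", size = 0.6) +
      # The secondary (right) axis divides by 25, so it reads in df2's original density units
      scale_y_continuous(sec.axis = sec_axis(~ . / 25)) +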
      theme(axis.text.y.right = element_text(color = "red"))
    

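The factor of 25 above was chosen by eye for this example. As a rough sketch, the factor could instead be estimated from the data by comparing the peaks of the two kernel density estimates. This assumes that base R's `density()` defaults are close enough to what `geom_density()` computes for the ratio to be useful. The names `peak1`, `peak2`, and `scale_factor` are illustrative and not part of ggplot2.

```r
library(ggplot2)

df1 <- diamonds[1:10, "price"]
df2 <- diamonds[100:2100, "price"]

# Peak height of each kernel density estimate, using base R's density()
# defaults (assumed here to approximate what geom_density() draws).
peak1 <- max(density(df1$price)$y)
peak2 <- max(density(df2$price)$y)

# Illustrative name: how much taller df1's peak is than df2's.
scale_factor <- peak1 / peak2

p <- ggplot() +
  geom_density(data = df1, aes(x = price)) +
  # Stretch df2's density so its peak is comparable to df1's
  geom_density(data = df2,
               aes(x = price, y = after_stat(density) * scale_factor),
               color = "red") +
  scale_y_continuous(
    name = "density (df1)",
    # Undo the stretch so the right axis reads in df2's density units
    sec.axis = sec_axis(~ . / scale_factor, name = "density (df2)")
  ) +
  theme(axis.text.y.right = element_text(color = "red"),
        axis.title.y.right = element_text(color = "red"))

p
```

With this approach both curves peak at about the same height, which may or may not be what you want. Multiplying `scale_factor` by a constant below 1 would keep the second curve visibly lower while still leaving the right axis correct, since the same value is used in both places.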