r ggplot2 geom-bar

Geom_bar with position = 'fill', hlines for means


I am trying to plot a graph that shows the relative percentage of granted loans for two customer groups (1-5 and 6-8) each month. This is how I did it:

library(ggplot2)

df <- data.frame(time = rep(seq.Date(as.Date('2015-01-01'), as.Date('2018-01-01'), by = 'month'), 2),
                 key = c(rep('1-5', 37), rep('6-8', 37)),
                 value = c(round(rnorm(37, 400, 20)), round(rnorm(23, 100, 10)),
                           round(rnorm(14, 250, 10))))


ggplot(df,aes(x=time,y=value,fill=key))+
  geom_bar(stat = "identity",position = "fill")+
  geom_vline(xintercept = as.numeric(as.Date('2016-12-01')), size=1)

The result

What I would like is to add horizontal lines showing the mean percentage of the 6-8 group before and after 2017.


Solution

  • Pre-calculate the 6-8 group's average share before and after the key date, then add those averages to the plot as an extra layer. Something like this:

    library(ggplot2)
    library(dplyr)
    library(tidyr)
    
    df <-
      data.frame(
        time = rep(seq.Date(
          as.Date('2015-01-01'), as.Date('2018-01-01'), by = 'month'
        ), 2),
        key = c(rep('1-5', 37), rep('6-8', 37)),
        value = c(round(rnorm(37, 400, 20)), round(rnorm(23, 100, 10)),
                  round(rnorm(14, 250, 10)))
      )
    
    # reshape to one row per month and compute each group's share of the monthly total
    (
      dd <- df %>% 
        spread(key, value) %>% 
        mutate(f15=`1-5`/(`1-5`+`6-8`)) %>% 
        mutate(f68=1-f15)
    )
    
    # label each month as before/after 2016-12-01 and attach that period's
    # mean 6-8 share to every row (mutate keeps one row per month)
    (
      mnp <- dd %>% 
        mutate(ba=ifelse(time > as.Date('2016-12-01'), "after", "before")) %>% 
        group_by(ba) %>% 
        mutate(mnp=mean(f68))
    )
    
    # add to plot: one blue dash per month, drawn at that period's mean
    ggplot(df, aes(x = time, y = value, fill = key)) +
      geom_bar(stat = "identity", position = "fill") +
      geom_vline(xintercept = as.numeric(as.Date('2016-12-01')), size = 1) +
      geom_point(data=mnp, aes(x=time, y=mnp), pch="-", size=5, inherit.aes = FALSE, color="blue")
    

    This should draw a row of blue dashes across the bars at the mean 6-8 share on each side of the vertical line.
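
Since the question asks for horizontal lines, here is a minimal sketch of a variant that draws one continuous segment per period instead of a dash per bar. It is not part of the original answer: it swaps `geom_point` for `geom_segment` and summarises to one row per period. The names `cutoff`, `period_means`, `start`, `end` and `mean_f68` are made up for illustration, and the `set.seed()` call is an added assumption so the simulated data is reproducible.

```r
library(ggplot2)
library(dplyr)

set.seed(1)  # assumption: fixed seed, not in the original post

# Same simulated data as in the question
df <- data.frame(
  time  = rep(seq.Date(as.Date("2015-01-01"), as.Date("2018-01-01"), by = "month"), 2),
  key   = c(rep("1-5", 37), rep("6-8", 37)),
  value = c(round(rnorm(37, 400, 20)),
            round(rnorm(23, 100, 10)),
            round(rnorm(14, 250, 10)))
)

cutoff <- as.Date("2016-12-01")  # hypothetical name for the key date

# Monthly share of the 6-8 group, then one row per period holding the
# period's first month, last month and mean share
period_means <- df %>%
  group_by(time) %>%
  summarise(f68 = sum(value[key == "6-8"]) / sum(value), .groups = "drop") %>%
  mutate(period = ifelse(time > cutoff, "after", "before")) %>%
  group_by(period) %>%
  summarise(start = min(time), end = max(time),
            mean_f68 = mean(f68), .groups = "drop")

# Filled bars plus one horizontal segment per period at the mean share
p <- ggplot(df, aes(x = time, y = value, fill = key)) +
  geom_bar(stat = "identity", position = "fill") +
  geom_vline(xintercept = as.numeric(cutoff)) +
  geom_segment(
    data = period_means,
    aes(x = start, xend = end, y = mean_f68, yend = mean_f68),
    inherit.aes = FALSE, colour = "blue"
  )

print(p)
```

As in the answer above, this relies on the 6-8 group being stacked at the bottom of each bar, so a height of `mean_f68` lines up with the top of that group's share. Months on the cutoff date itself are counted as "before", matching the `time > as.Date('2016-12-01')` test in the answer.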