r ggplot2 plot geom-bar

Add difference in % to a grouped bar chart


I have the following data frame in R showing several attributes for some community districts (field CD) in two different years:

#Example data with one single attribute

x <- structure(list(numbldgs = c(195, 845, 3621, 3214, 10738, 793, 
223, 957, 4248, 3456, 11576, 803), Year = c("2007", "2007", "2007", 
"2007", "2007", "2007", "2018", "2018", "2018", "2018", "2018", 
"2018"), CD = c("103", "111", "210", "313", "414", "501", "103", 
"111", "210", "313", "414", "501")), row.names = c(NA, -12L), class = c("tbl_df", 
"tbl", "data.frame"))

I am plotting this data using the following code:

library(ggplot2)

ggplot(x, aes(x=CD, y=numbldgs, fill = Year)) +
  geom_bar(stat="identity", width=.9, position = "dodge2") +
  labs( x="", y = "Number of buildings")+
  theme_classic() +
  theme(axis.text.x = element_text(angle=0, vjust=0.5, size=16), 
        axis.text.y = element_text(angle=0, vjust=0.5, size=16),
        legend.text=element_text(size=14), legend.position="bottom",
        legend.title = element_text(size=16),
        axis.title=element_text(size=12)) +
  scale_fill_manual(values=c('#F6D3B5','#D93B0A')) + 
  scale_y_continuous(labels = function(x) format(x, scientific = FALSE))

This returns the following chart:

enter image description here

I am trying to add a label on top of the 2018 bar of each community district that shows the relative increment between the 2007 and the 2018 value, which would be expressed by the formula:

relative increment = ((value_in_2018 - value_in_2007)/(value_in_2007))*100
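As a quick sanity check of the formula, here is a minimal sketch for district 103, using its 2007 and 2018 values from the example data above. The variable names are only illustrative and do not appear in the dataset.

```r
# Values for CD 103 taken from the example data
value_2007 <- 195
value_2018 <- 223

# Relative increment in percent, as defined by the formula above
rel_inc <- (value_2018 - value_2007) / value_2007 * 100

# Rounded to one decimal place for display
rel_inc_rounded <- round(rel_inc, 1)
rel_inc_rounded
# 14.4
```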

I want to do this for several fields of the dataset. So if a new field is generated to hold the % increment, it needs to be generated for several fields at once (e.g. number of buildings, number of people, etc.). The label should also include the "%" symbol, looking as follows:

enter image description here


Solution

  • You can use dplyr (part of the tidyverse) to calculate the percentage increment within each district, then add it as a label with geom_text at the end:

    library(dplyr)
    library(ggplot2)

    # x is the example data frame from the question
    df2 <- x %>% 
      arrange(CD, Year) %>% 
      group_by(CD) %>% 
      # % change from 2007 to 2018 within each CD; lag() gives NA on the
      # 2007 row, so only the 2018 bar gets a label
      mutate(rel_inc = (numbldgs - lag(numbldgs)) / lag(numbldgs) * 100,
             rel_lab = ifelse(is.na(rel_inc), NA_character_,
                              paste0(round(rel_inc, 1), "%"))) %>% 
      ungroup()
    
    ggplot(df2, aes(x=CD, y=numbldgs, fill = Year)) +
      geom_bar(stat="identity", width=.9, position = "dodge2") +
      labs( x="", y = "Number of buildings")+
      theme_classic() +
      theme(axis.text.x = element_text(angle=0, vjust=0.5, size=16), 
            axis.text.y = element_text(angle=0, vjust=0.5, size=16),
            legend.text=element_text(size=14), legend.position="bottom",
            legend.title = element_text(size=16),
            axis.title=element_text(size=12)) +
      scale_fill_manual(values=c('#F6D3B5','#D93B0A')) + 
      scale_y_continuous(labels = function(x) format(x, scientific = FALSE))+
      # na.rm = TRUE silently drops the unlabeled 2007 rows
      geom_text(aes(label=rel_lab), position=position_dodge(width=0.9),
                vjust=-0.25, na.rm=TRUE)
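The question also asks for the increment to be computed for several fields at once. One possible approach, sketched here under assumptions, is dplyr's across() inside mutate(). The numbldgs values below come from the question (first three districts only), but the numpeople column, its values, the helper pct_change and the "_pct" suffix are all made up for illustration.

```r
library(dplyr)

# numbldgs is taken from the question; numpeople is a hypothetical second
# field with invented values, used only to show the multi-column case
dat <- data.frame(
  CD        = rep(c("103", "111", "210"), times = 2),
  Year      = rep(c("2007", "2018"), each = 3),
  numbldgs  = c(195, 845, 3621, 223, 957, 4248),
  numpeople = c(1000, 2000, 4000, 1100, 2500, 3000)
)

# Hypothetical helper: percentage change relative to the previous row
pct_change <- function(v) (v - dplyr::lag(v)) / dplyr::lag(v) * 100

dat_pct <- dat %>%
  arrange(CD, Year) %>%
  group_by(CD) %>%
  # adds one <field>_pct column per selected field; 2007 rows get NA
  mutate(across(c(numbldgs, numpeople), pct_change, .names = "{.col}_pct")) %>%
  ungroup()

dat_pct
```

Each plot can then build its label from the matching column, for example paste0(round(numbldgs_pct, 1), "%"), and pass it to geom_text as in the code above.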