Tags: r, ggplot2, position, geom-bar, geom-text

Stacked barplot where each stack is scaled to sum to 100%, with geom_text() labels (ggplot2 geom_bar)


I would like to extend the question posted here: "Create stacked barplot where each stack is scaled to sum to 100%", which draws a stacked barplot with the position argument of geom_bar set to "fill".

library(ggplot2)
library(dplyr)
library(tidyr)

dat <- read.table(text = "    ONE TWO THREE
1   23  234 324
2   34  534 12
3   56  324 124
4   34  234 124
5   123 534 654",sep = "",header = TRUE)

# Add an id variable for the filled regions and reshape
datm <- dat %>% 
  mutate(ind = factor(row_number())) %>%  
  gather(variable, value, -ind)

ggplot(datm, aes(x = variable, y = value, fill = ind)) + 
    geom_bar(position = "fill",stat = "identity") +
    # or:
    # geom_bar(position = position_fill(), stat = "identity") 
    scale_y_continuous(labels = scales::percent_format())

example figure

Imagine you would like to add segment labels. It works for me if I leave out position = "fill" (even though that messes up the y-axis scale).

ggplot(datm, aes(x = variable, y = value, fill = ind)) + 
    geom_bar(stat = "identity") +
    scale_y_continuous(labels = scales::percent_format()) +
    geom_text(label = 'bla')

But if I add position = "fill", the plot is messed up. The bars nearly disappear, as they are squashed into a scale that only goes up to 100%, and the labels appear detached from the bars in the grey area.

ggplot(datm, aes(x = variable, y = value, fill = ind)) + 
    geom_bar(position = "fill", stat = "identity") +
    # or:
    # geom_bar(position = position_fill(), stat = "identity") 
    scale_y_continuous(labels = scales::percent_format())+
    geom_text(label= 'bla') 


Why does this happen, and how can I fix it? Thanks!


Solution

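  • You need to apply the fill position adjustment to the text layer too. Otherwise the labels are placed at their raw y values while the bars are rescaled to the 0 to 1 range. You can control where the text appears relative to the bounds of each segment by adjusting the vjust parameter of position_fill() (0.5 centres it).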

    library(ggplot2)
    library(dplyr)
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #> 
    #>     intersect, setdiff, setequal, union
    library(tidyr)
    
    dat <- structure(list(ONE = c(23L, 34L, 56L, 34L, 123L), 
                          TWO = c(234L, 534L, 324L, 234L, 534L), 
                          THREE = c(324L, 12L, 124L, 124L, 654L)), 
                     class = "data.frame", 
                     row.names = c("1", "2", "3", "4", "5"))
    
    # Add an id variable for the filled regions and reshape
    datm <- dat %>% 
      mutate(ind = factor(row_number())) %>%  
      gather(variable, value, -ind)
    
    ggplot(datm, aes(x = variable, y = value, fill = ind)) + 
      geom_col(position = "fill") +
      scale_y_continuous(labels = scales::percent_format())+
      geom_text(label= 'bla', position = position_fill(vjust = 0.5)) 
    

    Created on 2021-01-19 by the reprex package (v0.3.0)

    I'm only registering this as an answer so people browsing for unanswered questions don't stumble upon this one. It is way less effort for me to suggest a 1 line fix in the comments.
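
As a follow-up to the fix above, here is a minimal sketch of labelling each segment with its actual share instead of a fixed string. It assumes the same `dat` as in the answer; the `share` column is a hypothetical name introduced here for illustration, and computing it with `group_by()` before plotting is just one possible approach.

```r
library(ggplot2)
library(dplyr)
library(tidyr)

dat <- data.frame(
  ONE   = c(23L, 34L, 56L, 34L, 123L),
  TWO   = c(234L, 534L, 324L, 234L, 534L),
  THREE = c(324L, 12L, 124L, 124L, 654L)
)

# Reshape as in the answer, then compute each segment's share of its bar.
# `share` is a hypothetical helper column, not part of the original code.
datm <- dat %>%
  mutate(ind = factor(row_number())) %>%
  gather(variable, value, -ind) %>%
  group_by(variable) %>%
  mutate(share = value / sum(value)) %>%
  ungroup()

p <- ggplot(datm, aes(x = variable, y = value, fill = ind)) +
  geom_col(position = "fill") +
  # Same position adjustment as the bars, so labels sit inside the segments
  geom_text(aes(label = scales::percent(share, accuracy = 1)),
            position = position_fill(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent_format())

p

# Label positions after the adjustment; they should fall inside the 0 to 1 range
range(layer_data(p, 2)$y)
```

The final line inspects the y positions the text layer ends up with. With `position_fill()` applied they stay within the rescaled bars, which is why the labels no longer float in the empty panel area.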