r, ggplot2, geom-text

Plotting count in geom_text after a geom_bar when proportions are unbalanced with position="fill"


I want to plot the following mock.data with a plot similar to the one produced by this code:

    library(ggplot2)

    # mock data
    mock.data <- data.frame(
      patient = c(rep("Healthy", 9925), rep("Sick", 75)),
      drug = rep(c("A", "B"), 5000))

    # plot code
    ggplot(mock.data, aes(y = patient, fill = drug)) +
      geom_bar(position = "fill") +
      geom_text(stat = "count",
                aes(label = after_stat(count),
                    x = after_stat(count) / sum(after_stat(count)),
                    group = drug),
                position = position_stack(0.5)) +
      xlab("proportion")

With the following output: [image: resulting plot]

The problem is that when I add a geom_text layer with after_stat(count), the labels for the Sick group are placed on the x axis according to their proportion of all four groups combined, not their proportion within the Sick or Healthy bar. What makes this case special is the huge imbalance between the groups, which pushes the Sick labels toward 0.
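To make the effect concrete, the two kinds of proportion can be compared in base R. This is only an illustrative sketch: the names counts, share_of_all and share_within_bar are invented here, and the code mimics the arithmetic rather than calling ggplot2.

```r
# Same mock data as in the question.
mock.data <- data.frame(
  patient = c(rep("Healthy", 9925), rep("Sick", 75)),
  drug = rep(c("A", "B"), 5000))

# Counts per patient/drug combination, i.e. the four values that
# after_stat(count) refers to in the plot.
counts <- table(mock.data$patient, mock.data$drug)

# What count / sum(count) evaluates to: each count divided by the
# total over ALL four combinations (10000 rows).
share_of_all <- counts / sum(counts)

# What the filled bars display: each count divided by its own bar's total.
share_within_bar <- prop.table(counts, margin = 1)

print(share_of_all)
print(share_within_bar)
```

For the Sick bar, drug A has 37 rows, so its share of all rows is 37/10000 = 0.0037, while its share within the bar is 37/75, roughly 0.49. Stacking the first kind of value is what squeezes the Sick labels against 0.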

All the solutions I've found precompute the proportions for each group separately before plotting, which I don't think is optimal.

Therefore, I wonder whether there is a way to do this inside the ggplot call itself, for example by transforming the data with ggplot's own parameters.


Solution

  • The bars are positioned with "fill", which takes the computed values and scales them to the 0-1 range within each bar, so the geom_text layer should use the same position adjustment. position = "fill" is equivalent to position = position_fill(), but the function form accepts further parameters. Here, vjust specifies how far along each bar segment the text is placed, and 0.5 centers it.

    library(ggplot2)

    mock.data <- data.frame(
      patient = c(rep("Healthy", 9925), rep("Sick", 75)),
      # one draw per row, so data.frame() does not silently recycle a shorter vector
      drug = sample(c("A", "B"), replace = TRUE, size = 10000, prob = c(0.2, 0.8)))

    # plot code
    ggplot(mock.data, aes(y = patient, fill = drug)) +
      geom_bar(position = "fill") +
      geom_text(stat = "count",
                aes(label = after_stat(count),
                    x = after_stat(count) / sum(after_stat(count)),
                    group = drug),
                # same position adjustment as the bars; vjust = 0.5 centers each label
                position = position_fill(vjust = 0.5)) +
      xlab("proportion")
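The per-bar rescaling described above can be sketched in base R. This is a rough illustration of the idea, not ggplot2's actual implementation: fill_midpoints is a hypothetical helper, the counts are the ones from the question's deterministic mock data written out by hand, and the order in which ggplot2 stacks the segments may differ from the A-then-B order assumed here.

```r
# Hypothetical helper (not a ggplot2 function): label positions for the
# segments of ONE bar after stacking the values and rescaling the bar to 0-1.
fill_midpoints <- function(values, vjust = 0.5) {
  edges <- c(0, cumsum(unname(values))) / sum(values)  # segment boundaries
  lower <- head(edges, -1)
  upper <- tail(edges, -1)
  mid <- (1 - vjust) * lower + vjust * upper            # vjust = 0.5 is the center
  names(mid) <- names(values)
  mid
}

# Counts per drug for each bar, taken from the question's mock data.
healthy <- c(A = 4963, B = 4962)
sick    <- c(A = 37,   B = 38)

print(fill_midpoints(healthy))
print(fill_midpoints(sick))

# Dividing by the grand total first, as count / sum(count) does in aes(),
# changes nothing because the rescaling happens within each bar.
print(fill_midpoints(sick / 10000))
```

Under these assumptions the Sick labels land at about 0.25 and 0.75 of the bar, just like the Healthy ones, which is why no per-group proportions need to be precomputed before plotting.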
    
