Tags: r, ggplot2, charts, stacked

ggplot bars colored conditionally


I have a problem similar to the one in "Conditionally color bars in ggplot".

I want the yes/no values stacked within each bar. For example, (0.01,0.09] should be a stacked bar in two colors, since it contains both yes and no, just as in the linked question.
When I try to add 'local manager' as a legend, the y-axis values are way off.

This is what I have (from my data), and what I think is supposed to work:

ggplot(df, aes(x = bins, y = factor(sum_by_bin), fill = Local.Manager)) +
  geom_col() +
  labs(y = "sum", x = "bins")

As you can see, the y-axis values don't match the values in the df. [my plot]

I can't figure out why that happens. Any help would be appreciated!

Data

In this data set, Local.Manager is named local_manager.

df <-
structure(list(local_manager = c("yes", "no", "yes", "yes", "no", 
"no", "no", "yes", "yes"), bins = c("(0.01,0.09]", "(0.01,0.09]", 
"(0.01,0.09]", "(0.01,0.09]", "(0.89,0.99]", "(0.89,0.99]", "(0.99,1]", 
"(0.69,0.79]", "(0.69,0.79]"), sum_by_bin = c(109L, 109L, 109L, 
109L, 56L, 56L, 45L, 33L, 33L)), class = "data.frame", row.names = c(NA,-9L))

Solution

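  • The values "don't match" because geom_col() stacks the rows of each bin on top of one another: the bar height is the sum of the y values in that bin, not a single value split among its rows. The bin (0.01,0.09], for example, has four rows of 109, so its bar reaches 436 instead of 109.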

    To get the result you want, split the value yourself: group the rows with group_by(bins), as in the other answer, and divide sum_by_bin by the number of rows in each bin:

    library(dplyr)
    library(ggplot2)

    df |>
      # Group by bins
      group_by(bins) |>
      # Divide sum_by_bin by the number of rows in the bin so each stacked bar totals sum_by_bin
      mutate(n = n(),
             y = sum_by_bin / n) |>
      # ggplot (the column is local_manager in the sample data, Local.Manager in the original data)
      ggplot(aes(x = bins,
                 y = y,
                 fill = local_manager)) +
      # columns plot
      geom_col()
    

    [plot with stacked bars of yes/no]
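
    The arithmetic behind the stacking can also be checked without plotting. The following is a minimal sketch in base R, assuming the sample data from the question; the names stacked_raw, rows_per_bin and stacked_split are made up for illustration and are not part of the original answer.

```r
# Sample data from the question (the column is named local_manager here).
df <- data.frame(
  local_manager = c("yes", "no", "yes", "yes", "no", "no", "no", "yes", "yes"),
  bins = c("(0.01,0.09]", "(0.01,0.09]", "(0.01,0.09]", "(0.01,0.09]",
           "(0.89,0.99]", "(0.89,0.99]", "(0.99,1]", "(0.69,0.79]", "(0.69,0.79]"),
  sum_by_bin = c(109L, 109L, 109L, 109L, 56L, 56L, 45L, 33L, 33L)
)

# Bar heights when sum_by_bin is stacked directly: every row adds its
# full value, so each bin total is multiplied by its number of rows.
stacked_raw <- tapply(df$sum_by_bin, df$bins, sum)

# Number of rows in the bin each row belongs to.
rows_per_bin <- ave(df$sum_by_bin, df$bins, FUN = length)

# Per-row share of the bin total.
df$y <- df$sum_by_bin / rows_per_bin

# Bar heights after the split: these add back up to sum_by_bin.
stacked_split <- tapply(df$y, df$bins, sum)

print(stacked_raw)
print(stacked_split)
```

    With this data, the bar for (0.01,0.09] drops from 436 to 109 once each row carries only its share. Every row in a bin gets the same share, so the yes/no segments are proportional to the number of yes and no rows in that bin.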