Tags: r, date, ggplot2, bar-chart, cumsum

Cumulative sum across time


I have a dataset that I want to summarize through time. It covers ten dates, with flower counts on three plants (Tomato, Pepper, Squash). I would like to create a stacked bar plot with ggplot2, colored by plant, that shows the running total of flowers. The y axis should be the cumulative sum of flowers and the x axis should be time. When I use cumsum, the output does not make sense to me. Any help would be great! Thanks.

dataset here

    df.sum <- df.sub %>% group_by(Date) %>% mutate(cumsum_covered = cumsum(Tomato))

    ggplot(df.sum, aes(x = Date, y = cumsum_covered)) + geom_bar(stat = "identity")

Solution

  • You are grouping by Date, so each group contains only that date's row and the cumsum is always just that single value. Instead, we want the cumsum for each plant, accumulated in date order:

    library(dplyr)
    library(tidyr)
    library(ggplot2)

    df.sum <- df.sub %>%
      # Reshape to long format: one row per Date and plant (Date, fruit, amount)
      gather(fruit, amount, Tomato, Pepper, Squash) %>%
      # Sort by date so the running total follows time
      arrange(Date) %>%
      # Group by plant so each cumsum only accumulates that plant's counts
      group_by(fruit) %>%
      mutate(cumsum_covered = cumsum(amount)) %>%
      ungroup()

    ggplot(df.sum, aes(Date, cumsum_covered, fill = fruit)) +
      geom_col(position = "stack")
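
Since the original dataset is not shown, here is a small reproducible sketch of the same idea. The data frame below is invented purely for illustration (four dates, made-up counts), and it assumes Date is stored as a Date column so that sorting follows time. It uses pivot_longer(), the newer tidyr replacement for gather(); this is a substitution on my part, not something the answer above requires.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Hypothetical stand-in for df.sub; the counts are invented for illustration
df.sub <- data.frame(
  Date   = as.Date("2021-06-01") + 0:3,
  Tomato = c(1, 2, 0, 3),
  Pepper = c(0, 1, 1, 2),
  Squash = c(2, 0, 4, 1)
)

# Grouping by Date: each group has one row, so cumsum() just returns that row
by_date <- df.sub %>%
  group_by(Date) %>%
  mutate(cumsum_covered = cumsum(Tomato)) %>%
  ungroup()

# Grouping by plant: cumsum() accumulates down the dates within each plant
df.sum <- df.sub %>%
  pivot_longer(c(Tomato, Pepper, Squash),
               names_to = "fruit", values_to = "amount") %>%
  arrange(Date) %>%
  group_by(fruit) %>%
  mutate(cumsum_covered = cumsum(amount)) %>%
  ungroup()

# Stacked bars: the height at each date is the running total over all plants
p <- ggplot(df.sum, aes(Date, cumsum_covered, fill = fruit)) +
  geom_col(position = "stack")
print(p)
```

With this toy data, by_date$cumsum_covered is identical to the Tomato column (1, 2, 0, 3), which is the "does not make sense" result from the question, while the Tomato rows of df.sum carry the running total 1, 3, 3, 6.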