r ggplot2 moving-average

How to calculate a moving average


I came across this article in The New York Times today about coronavirus, and I liked how the graphs were presented. I know the bar plots can be made with geom_col() in ggplot2, but I am more interested in the smoothing part, as in this graph:

[Image: graph from the New York Times article]

They said that "each red line is the seven-day moving average, which smooths out day-to-day anomalies..." How do you do that? I have a dataset that I plan to present in a similar way.

Thanks!


Solution

  • data.table also has a rolling mean function, frollmean, which can be used for this purpose:

    library(data.table)
    library(ggplot2)
    library(scales)
    
    # create some data
    set.seed(1)
    DT <- data.table(N = rescale(dnorm(seq(-10, 10, by=.1)) + 
            runif(201, -.1, .1), c(1, 800)))
    
    # apply rolling mean over 10 data points
    DT[, `:=`(rollN = frollmean(N, n = 10, align = "center"), idx = .I)]
    
    ggplot(DT, aes(x=idx, y=N)) + 
        theme_bw() + 
        geom_line() + # original data
        geom_line(data=DT, aes(x=idx, y=rollN), colour = "red", size = 2) +  # rolling mean
        # bars: with weight = N/10 and binwidth = 10, each bar is roughly the mean of N in its bin
        geom_histogram(aes(x=idx, weight = N/10), binwidth = 10, inherit.aes = FALSE, fill="red", alpha = .2)
    #> Warning: Removed 9 row(s) containing missing values (geom_path).
    

    Created on 2020-03-19 by the reprex package (v0.3.0)
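
The example above smooths over 10 arbitrary points. For the seven-day case asked about in the question, a minimal sketch could look like the following. It assumes one row per day with no missing dates, and it assumes a trailing window (each value averages that day and the six before it); the article does not say which alignment it uses, so switch to align = "center" if a centered window is wanted. The data, the table name daily, and the column names date, count and avg7 are made up for illustration.

```r
library(data.table)
library(ggplot2)

# Hypothetical daily counts: 60 days of noisy, rising numbers (illustration only)
set.seed(1)
daily <- data.table(
  date  = seq(as.Date("2020-03-01"), by = "day", length.out = 60),
  count = rpois(60, lambda = seq(5, 120, length.out = 60))
)

# Trailing seven-day mean: align = "right" averages each day with the six before it.
# The first six rows are NA because their window is incomplete.
daily[, avg7 := frollmean(count, n = 7, align = "right")]

p <- ggplot(daily, aes(x = date)) +
  theme_bw() +
  geom_col(aes(y = count), fill = "red", alpha = 0.2) +      # daily bars
  geom_line(aes(y = avg7), colour = "red", na.rm = TRUE)     # seven-day average
print(p)
```

If some dates are missing from the data, the window would cover seven rows rather than seven calendar days, so the gaps would need to be filled in before calling frollmean.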