r ggplot2 histogram facet

Plot a histogram for every consecutive n years in R


I have a few years of daily rainfall data, like:

date         value
01/01/1990    1.02
02/01/1990    0.50
03/01/1990    0.00
.........     ...
.........     ...
12/12/2015    10.25

From this I need to plot a histogram for every consecutive five-year window, i.e. a histogram of 1990 to 1995, then 1991 to 1996, and so on. I tried using ggplot2 with faceting, but could not find a way.

rf_facet <- inp %>%
  filter(between(rain,1,100))

ggplot(rf_facet, aes(x = rain)) + facet_wrap(~year, nrow = 5) +
  geom_histogram(aes(fill =..count..))

This only produces a plot for each single year; I am looking for one plot for every consecutive five years.

Any help would be appreciated. Example data is here


Solution

  • Here's an example that uses ggplot2 and cowplot. The function plot_rain() draws the histogram for the years i to i+5 (between() is inclusive at both ends, which matches the 1990 to 1995 windows in the question), and lapply() runs it for every possible starting year.

    # Load libraries
    library(dplyr)
    library(cowplot)
    library(ggplot2)
    library(lubridate)
    
    # Dummy data: one uniform random value per day, 1990 through 2000
    df <- data.frame(date = seq(as.Date("01/01/1990", format = "%d/%m/%Y"),
                                as.Date("31/12/2000", format = "%d/%m/%Y"), by = "day"))
    df$value <- runif(nrow(df), 0, 100)
    
    # Plotting function: histogram of all observations from year i to year i+5 (inclusive)
    plot_rain <- function(i){
      g <- ggplot(df %>% filter(between(year(date), i, i+5)))
      g <- g + geom_histogram(aes(value))
      g <- g + xlab("Rainfall (mm)") + ylab("# of obs")
      g <- g + ggtitle(paste(i, i+5, sep = "-"))
      g  # return the plot explicitly
    }
    
    # Run for every starting year that still leaves a full window
    plist <- lapply(min(year(df$date)):(max(year(df$date))-5), plot_rain)
    
    # Use cowplot to plot the list of figures
    plot_grid(plotlist = plist, ncol = 2)
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    #> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.
    

    Created on 2019-03-07 by the reprex package (v0.2.1)
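
Since the question asked about faceting, a possible alternative is to stack one copy of the data per window and let facet_wrap() lay out the panels. This is a minimal sketch rather than part of the original answer. The column names `yr` and `window`, the `span` variable, the seed, and `binwidth = 5` are assumptions made for illustration, and the dummy data mirrors the one above.

```r
library(dplyr)
library(ggplot2)
library(lubridate)

# Dummy data (assumed layout: one row per day with `date` and `value`)
set.seed(1)
df <- data.frame(date = seq(as.Date("1990-01-01"), as.Date("2000-12-31"), by = "day"))
df$value <- runif(nrow(df), 0, 100)
df$yr <- year(df$date)

# Each window runs from a start year through start year + span, inclusive
span <- 5
starts <- min(df$yr):(max(df$yr) - span)

# Stack one copy of the data per window, labelled by its year range
windows <- bind_rows(lapply(starts, function(s) {
  df %>%
    filter(between(yr, s, s + span)) %>%
    mutate(window = paste(s, s + span, sep = "-"))
}))

# One facet per window; an explicit binwidth avoids the stat_bin() message
p <- ggplot(windows, aes(x = value)) +
  geom_histogram(binwidth = 5) +
  facet_wrap(~window, ncol = 2) +
  labs(x = "Rainfall (mm)", y = "# of obs")
p
```

Because every observation is copied into each window it belongs to, the stacked data frame is several times larger than the input. That is usually fine for a few decades of daily data, and all panels share the same axes by default, which makes the windows easier to compare.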