Tags: r, subset, stringr, posixlt

Subset data based on pentad dates with leap year


I'm trying to subset the following data by pentad dates. A pentad is a non-overlapping 5-day average. In leap years, pentad 12 includes February 29 (a 6-day average instead of 5):

Link to Data

Link to pentad dates

Here's my code:

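library(stringr)

dat <- read.csv("tc_filt_1981-2007.csv", header = TRUE, sep = ",")

# Build an ISO date string (YYYY-MM-DD) from the Year, Month, and Day columns
dat$Date <- paste(dat$Year, str_pad(dat$Month, 2, "left", "0"),
                  str_pad(dat$Day, 2, "left", "0"), sep = "-")
dat$yday <- as.POSIXlt(dat$Date)$yday + 1   # day of year, 1-based
dat$pentad <- ceiling(dat$yday / 5)         # 5-day bins; ignores leap years
df <- split(dat, dat$pentad)                # one data frame per pentad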

Problem:

The dat$pentad calculation only works for 365-day years. A given year should have only 73 pentads, but my code above produces 74, as I can see when I check dat$pentad. df contains one data frame per pentad.
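
The extra pentad follows directly from the arithmetic: in a leap year December 31 is day 366, and 366 / 5 rounds up to 74. A quick base-R check, using two example dates chosen only for illustration:

```r
# Day of year (1-based) for Dec 31 in a non-leap year and in a leap year
yday <- as.POSIXlt(as.Date(c("2001-12-31", "2000-12-31")))$yday + 1
yday               # 365 366
ceiling(yday / 5)  # 73 74
```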

I did the following for checking:

test <- dat[which(dat$pentad == 74), ]

Output:

    SN CY Year Month Day Hour  Lat   Lon Cat       Date yday pentad
200034 34 2000    12  31    0 12.7 128.2  TS 2000-12-31  366     74
200034 34 2000    12  31    6 13.3 128.8  TS 2000-12-31  366     74
200034 34 2000    12  31   12 13.9 129.7  TS 2000-12-31  366     74
200034 34 2000    12  31   18 14.4 130.6  TS 2000-12-31  366     74

Question:

How can I account for leap years in my code so that every year has exactly 73 pentads?

Many thanks,


Solution

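  • Minor adjustment: in leap years, shift every day after February 28 (yday > 59) back by one, so that February 29 falls into pentad 12 and December 31 into pentad 73: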

    library(lubridate)
    dat$pentad = ceiling( (dat$yday - leap_year(dat$Year)*(dat$yday > 59)) / 5 )
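
The same adjustment can be sketched in base R to check it on a few dates without lubridate. The helper names `is_leap` and `pentad_of` are hypothetical and not part of the original answer; `is_leap` is assumed to be an adequate stand-in for `leap_year()` using the usual Gregorian rule.

```r
# Hypothetical helper: Gregorian leap-year rule (stand-in for lubridate::leap_year)
is_leap <- function(year) (year %% 4 == 0 & year %% 100 != 0) | (year %% 400 == 0)

# Hypothetical helper: pentad index for a vector of Dates
pentad_of <- function(date) {
  lt   <- as.POSIXlt(date)
  yday <- lt$yday + 1        # day of year, 1-based
  year <- lt$year + 1900     # POSIXlt stores years since 1900
  # In leap years, shift every day after Feb 28 (yday > 59) back by one,
  # so Feb 29 joins pentad 12 and Dec 31 lands in pentad 73
  ceiling((yday - is_leap(year) * (yday > 59)) / 5)
}

dates <- as.Date(c("2000-02-28", "2000-02-29", "2000-03-01", "2000-03-02",
                   "2000-12-31", "2001-12-31"))
pentad_of(dates)  # 12 12 12 13 73 73
```

With this adjustment, pentad 12 in a leap year spans six days (February 25 through March 1), and every other pentad keeps five.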