
Exclude rows with certain time of day


I have a time series of continuous data measured at 10 minute intervals for a period of five months. For simplicity's sake, the data is available in two columns as follows:

Timestamp   Temp.Diff
2/14/2011 19:00 -0.385
2/14/2011 19:10 -0.535
2/14/2011 19:20 -0.484
2/14/2011 19:30 -0.409
2/14/2011 19:40 -0.385
2/14/2011 19:50 -0.215

... And it goes on for the next five months. I have parsed the Timestamp column using as.POSIXct.

I want to select rows that fall within certain times of the day (e.g. from 12 noon to 3 PM). I would like either to exclude the other hours of the day, or to extract just those 3 hours while keeping the data in sequential order (i.e. as a time series).


Solution

  • You seem to know the basic idea, but are just missing the details. As you mentioned, we just transform the Timestamps into POSIX objects then subset.

    lubridate Solution

    The easiest way is probably with lubridate. First load the package:

    library(lubridate)
    

    Next convert the timestamp:

    ## mdy_hm = month, day, year _ hour, minute
    d = mdy_hm(dd$Timestamp)
    

    Then we select what we want. In this case, I keep every row with a time after 7:30 PM, regardless of the date:

    dd[(hour(d) == 19 & minute(d) > 30) | hour(d) >= 20, ]
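
    The same idea can be adapted to the window asked about in the question (12 noon to 3 PM). The following is only a sketch: the data frame `ex` and its values are hypothetical sample data made up for illustration, and including both endpoints is an assumption. Converting each time to minutes since midnight turns the window into a plain numeric comparison.

    ```r
    library(lubridate)

    # Hypothetical sample data spanning midday (not from the original post)
    ex <- data.frame(
      Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 13:30",
                    "2/14/2011 15:00", "2/14/2011 15:10", "2/15/2011 12:10"),
      Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385, -0.215),
      stringsAsFactors = FALSE
    )

    # Parse month/day/year hour:minute strings
    parsed <- mdy_hm(ex$Timestamp)

    # Minutes since midnight for each observation
    mins <- hour(parsed) * 60 + minute(parsed)

    # Keep 12:00 through 15:00 (both ends included), on every day
    midday <- ex[mins >= 12 * 60 & mins <= 15 * 60, ]
    ```

    Because this only subsets rows, the selected observations stay in their original chronological order.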
    

    Base R solution

    First create a lower limit (only its time of day is used, so the date itself is arbitrary):

    lower = strptime("2/14/2011 19:30","%m/%d/%Y %H:%M")
    

    Next transform the Timestamps into POSIX objects:

    d = strptime(dd$Timestamp, "%m/%d/%Y %H:%M")
    

    Finally, a bit of dataframe subsetting:

    dd[format(d,"%H:%M") > format(lower,"%H:%M"),]
    

    Thanks to plannapus for this last part
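
    For the 12 noon to 3 PM window from the question, the base R approach might look like the sketch below. The data frame `ex` is hypothetical sample data, `tz = "UTC"` is an assumption made only to keep the example reproducible, and the `masked` variant is one possible reading of "keep the data flowing sequentially". The comparison works because zero-padded "HH:MM" strings sort in clock order.

    ```r
    # Hypothetical sample data spanning midday (not from the original post)
    ex <- data.frame(
      Timestamp = c("2/14/2011 11:50", "2/14/2011 12:00", "2/14/2011 13:30",
                    "2/14/2011 15:00", "2/14/2011 15:10", "2/15/2011 12:10"),
      Temp.Diff = c(-0.385, -0.535, -0.484, -0.409, -0.385, -0.215),
      stringsAsFactors = FALSE
    )

    # Parse with as.POSIXct, as in the question; the time zone is an assumption
    parsed <- as.POSIXct(ex$Timestamp, format = "%m/%d/%Y %H:%M", tz = "UTC")

    # Time of day as a zero-padded "HH:MM" string
    hm <- format(parsed, "%H:%M")

    # TRUE for rows between 12:00 and 15:00 inclusive
    inside <- hm >= "12:00" & hm <= "15:00"

    midday <- ex[inside, ]    # extract only the window
    others <- ex[!inside, ]   # or exclude the window instead

    # Alternative: keep every row so the 10-minute spacing stays regular,
    # and blank out the values that fall outside the window
    masked <- ex
    masked$Temp.Diff[!inside] <- NA
    ```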


    Data for the above example:

    dd = read.table(textConnection('Timestamp Temp.Diff
    "2/14/2011 19:00" -0.385
    "2/14/2011 19:10" -0.535
    "2/14/2011 19:20" -0.484
    "2/14/2011 19:30" -0.409
    "2/14/2011 19:40" -0.385
    "2/14/2011 19:50" -0.215'), header=TRUE)