Search code examples
rtimegroupinglubridateperiod

How to define periods of time in hours when I have merged dates and time in R?


I have obtained a data set of several columns with different type of data from an experiment and I needed to start the counting exactly at a specific date and grouping data every 24h.

The drawback is that I have the date and the time together (i.e., "2019-04-06 07:45:00"). I've tried to calculate the differences between cells, but what I obtain is the difference between dates, and times between 00:00:00 and 07:45:00 are included in the wrong day.

Is there a way to calculate the 24h periods?

I also tried to convert the cells's type with lubridate package, unsuccessfully. Because I do not know how to set the first row into the starting point of the 24h range period.

Thank you in advance.


Solution

  • If I understand you correctly, you wish to include any time before 07:45 am with the previous day's data. If that is the case, just subtract 7 hours and 45 minutes from each date time and find its date.

    Suppose we have data like this:

    df <- data.frame(Time  = as.POSIXct(c("2020-08-20 07:45:00", 
                                          "2020-08-20 08:45:00",
                                          "2020-08-20 09:45:00",
                                          "2020-08-20 15:45:00",
                                          "2020-08-20 21:45:00",
                                          "2020-08-21 05:45:00",
                                          "2020-08-21 07:45:00",
                                          "2020-08-21 21:45:00",
                                          "2020-08-22 06:45:00",
                                          "2020-08-22 08:45:00")), Value = 1:10)
    
    df
    #>                   Time Value
    #> 1  2020-08-20 07:45:00     1
    #> 2  2020-08-20 08:45:00     2
    #> 3  2020-08-20 09:45:00     3
    #> 4  2020-08-20 15:45:00     4
    #> 5  2020-08-20 21:45:00     5
    #> 6  2020-08-21 05:45:00     6
    #> 7  2020-08-21 07:45:00     7
    #> 8  2020-08-21 21:45:00     8
    #> 9  2020-08-22 06:45:00     9
    #> 10 2020-08-22 08:45:00    10
    

    Then we can just do:

    library(lubridate)
    df$Date <- date(df$Time - hours(7) - minutes(45))
    

    And we will have our data grouped appropriately:

    df
    #>                   Time Value       Date
    #> 1  2020-08-20 07:45:00     1 2020-08-20
    #> 2  2020-08-20 08:45:00     2 2020-08-20
    #> 3  2020-08-20 09:45:00     3 2020-08-20
    #> 4  2020-08-20 15:45:00     4 2020-08-20
    #> 5  2020-08-20 21:45:00     5 2020-08-20
    #> 6  2020-08-21 05:45:00     6 2020-08-20
    #> 7  2020-08-21 07:45:00     7 2020-08-21
    #> 8  2020-08-21 21:45:00     8 2020-08-21
    #> 9  2020-08-22 06:45:00     9 2020-08-21
    #> 10 2020-08-22 08:45:00    10 2020-08-22
    

    Created on 2020-08-20 by the reprex package (v0.3.0)