I am looking for a data.table solution to the following question.
Suppose I have the following dataset:
library(data.table)
library(lubridate)
DT <- data.table(date = c("2019-01-01 00:03:04",
"2019-01-01 00:07:03",
"2019-01-01 00:15:23",
"2019-01-01 00:16:28",
"2019-01-01 00:21:30"))
DT
date
1: 2019-01-01 00:03:04
2: 2019-01-01 00:07:03
3: 2019-01-01 00:15:23
4: 2019-01-01 00:16:28
5: 2019-01-01 00:21:30
I am looking to map each date into 15 minute interval for each day. So the final dataset would look like:
date date_15min_grp
1: 2019-01-01 00:03:04 1
2: 2019-01-01 00:07:03 1
3: 2019-01-01 00:15:23 2
4: 2019-01-01 00:16:28 2
5: 2019-01-01 00:21:30 2
The groups would be unique at the day level, so 2019-01-01 00:03:04
and 2019-01-02 00:03:04
would both return values of 1. My current solution is to extract the hour/minute/second component of each day. I would then add the values (after conversion into minutes), divide them by 60, and try to map each value to the nearest "group". But this seems to be slow and inelegant.
A data.table/lubridate would be greatly appreciated.
Thanks so much!
Please find below a solution using data.table
and lubridate
Reprex
library(data.table)
library(lubridate)
DT[, date_15min_grp := fcase(minute(date) < 15, 1,
minute(date) < 30, 2,
minute(date) < 45, 3,
default = 4)][]
#> date date_15min_grp
#> 1: 2019-01-01 00:03:04 1
#> 2: 2019-01-01 00:07:03 1
#> 3: 2019-01-01 00:15:23 2
#> 4: 2019-01-01 00:16:28 2
#> 5: 2019-01-01 00:21:30 2
Created on 2021-11-30 by the reprex package (v2.0.1)
AS A FOLLOW-UP TO YOUR COMMENT
library(data.table)
library(lubridate)
DT[, date_15min_grp := fcase(minute(date) < 15, hour(date)*4 + 1,
minute(date) < 30, hour(date)*4 + 2,
minute(date) < 45, hour(date)*4 + 3,
minute(date) < 60, hour(date)*4 + 4)][]
#> date date_15min_grp
#> 1: 2019-01-01 00:03:04 1
#> 2: 2019-01-01 00:07:03 1
#> 3: 2019-01-01 00:15:23 2
#> 4: 2019-01-01 00:16:28 2
#> 5: 2019-01-01 00:21:30 2
Created on 2021-12-01 by the reprex package (v2.0.1)