I have a data_frame
with POSIXct
date-times. I would now like to create a variable that cuts these date-times into timebands: 1 -- [00:00:00, 08:00:00), 2 -- [08:00:00, 17:00:00), 3 -- [17:00:00, 18:30:00), 4 -- [18:30:00, 00:00:00).
Here is some sample data:
df_times = data_frame(
datetime = seq.POSIXt(
from = as.POSIXct(strftime("2016-01-01 00:00:00", format = "%Y-%m-%d :%H:%M:%S")),
by = "min",
length.out = 100000
),
value = rnorm(100000)
)
Here is the expected output:
> df_times
# A tibble: 100,000 × 3
datetime value band
<dttm> <dbl> <dbl>
1 2016-01-01 00:00:00 0.5855288 1
2 2016-01-01 00:01:00 0.7094660 1
3 2016-01-01 00:02:00 -0.1093033 1
4 2016-01-01 00:03:00 -0.4534972 1
5 2016-01-01 00:04:00 0.6058875 1
6 2016-01-01 00:05:00 -1.8179560 1
7 2016-01-01 00:06:00 0.6300986 1
8 2016-01-01 00:07:00 -0.2761841 1
9 2016-01-01 00:08:00 -0.2841597 1
10 2016-01-01 00:09:00 -0.9193220 1
# ... with 99,990 more rows
I have tried cut.POSIXt
but that insists on keeping track of dates. An ideal solution will use dplyr::recode
or forcats::
.
Here is the solution I think directly translates the intent of the question into code:
set.seed(12345)
# create a dataset
df_times = data_frame(
datetime = seq.POSIXt(
from = as.POSIXct("2016-01-01 00:00:00", format = "%Y-%m-%d %H:%M:%S"),
by = "min",
length.out = 100000
),
value = rnorm(100000)
) %>%
mutate(
time = times(format(datetime, "%H:%M:%S")),
cut(
time,
breaks = times(c(
"00:00:00",
"08:00:00",
"17:00:00",
"18:30:00",
"23:59:59"
)),
labels = c(
"1",
"2",
"3",
"4"
),
include.lowest = TRUE,
right = FALSE
)
)