I have the following data:
df <- data.frame(dt1 = c("2019-05-02", "2019-01-02", "2019-06-02"),
dt2 = c("2019-08-30", "2019-04-24", "2019-12-06") )
> df
dt1 dt2
1 2019-05-02 2019-08-30
2 2019-01-02 2019-04-24
3 2019-06-02 2019-12-06
Here is what I want to do:
i) I want create factors by binning, for example, for the first date, the dates binned as 2019-07-31, 2019-06-30, 2019-05-31, so essentially binning by dt2.
ii) I want to count the total number of dates in each bin.
The expected output is:
dt1 dt2 val_count
1 2019-05-02 2019-08-30 3
2 2019-01-02 2019-04-24 3
3 2019-06-02 2019-12-06 6
I found this post relevant.
Note: I do not want take difference between months of two dates.
Thank you for suggestions.
It's pretty messy but if you want to count how many last date of the months are in between dt1
and dt2
, you may try
library(lubridate)
library(dplyr)
fd <- paste0(lubridate::year(min(df$dt1, df$dt2)), "-02-01") %>% as.Date()
ld <- paste0(lubridate::year(max(df$dt1, df$dt2))+1, "-01-01") %>% as.Date()
x <- seq.Date(fd, ld, by = "month") - 1
df %>%
rowwise() %>%
mutate(val_count = length(x[dt1 < x & x < dt2]))
dt1 dt2 val_count
<chr> <chr> <int>
1 2019-05-02 2019-08-30 3
2 2019-01-02 2019-04-24 3
3 2019-06-02 2019-12-06 6
Choice of <
or <=
depends on your purpose.
To get total days between dt1
and dt2
,
df %>%
rowwise() %>%
mutate(val_count = length(x[dt1 < x & x < dt2])) %>%
mutate(dd = as.Date(dt2) - as.Date(dt1))
dt1 dt2 val_count dd
<chr> <chr> <int> <drtn>
1 2019-05-02 2019-08-30 3 120 days
2 2019-01-02 2019-04-24 3 112 days
3 2019-06-02 2019-12-06 6 187 days
df %>%
rowwise() %>%
mutate(val_count = length(x[dt1 < x & x < dt2]),
val_count = ifelse(val_count == 0, 1, val_count)) %>%
mutate(dd = as.Date(dt2) - as.Date(dt1))
dt1 dt2 val_count dd
<chr> <chr> <dbl> <drtn>
1 2019-05-02 2019-08-30 3 120 days
2 2019-01-02 2019-04-24 3 112 days
3 2019-06-02 2019-12-06 6 187 days
4 2019-06-01 2019-06-02 1 1 days