In the following example, I'd like to calculate the variables Date_replaced
and Date_current_until
from Date
for each customer and product.
Date_replaced
reflects the timepoint at which a product bought by a certain customer was replaced by the purchase of another sort of product.
Date_current_until
reflects the timepoint until a product was the current product for each customer.
library(dplyr)
Customer_ID <- c(rep(1,4), rep(2,5))
Date <- c(seq(as.Date("2024-02-01"), as.Date("2024-02-04"), "days"), seq(as.Date("2024-02-01"), as.Date("2024-02-05"), "days"))
Goods <- c(rep("Food",3), "Toys", "Newspaper", rep("Food",3), "Toys")
Date_first_purchased <- c(rep("2024-02-01", 3), "2024-02-04", "2024-02-01", rep("2024-02-02", 3), "2024-02-05") %>% as.Date (., format="%Y-%m-%d")
Date_replaced <- c(rep("2024-02-04", 3), NA, "2024-02-02", rep("2024-02-05", 3), NA) %>% as.Date (., format="%Y-%m-%d")
Date_current_until <- c(rep("2024-02-04", 4), "2024-02-02", rep("2024-02-05", 4)) %>% as.Date (., format="%Y-%m-%d")
df <- data.frame(Customer_ID, Date, Goods, Date_first_purchased, Date_replaced, Date_current_until)
So far, I've tried the usual things when working on time series data, for instance, grouping data by ID
and arranging them by Date
within each ID
. I also tried using lead ()
in combination with mutate
but was not successful as it seems that lag()
and lead()
only support fixed numbers to find a previous or subsequent entry. However, this problem obviously requires a dynamic solution which I was not able to find yet. Maybe someone of you has tackled a similar problem previously. Any advice on how to do this is appreciated.
library(tidyverse)
df %>%
mutate(Date_replaced = if_else(lead(Goods) != Goods,
lead(Date),
max(Date[Goods == Goods])),
Date_current_until = if_else(row_number() == n(),
Date,
if_else(lead(Goods) != Goods,
lead(Date),
max(Date[Goods == Goods]))),
.by = Customer_ID)
# A tibble: 9 × 6
Customer_ID Date Goods Date_first_purchased Date_replaced Date_current_until
<dbl> <date> <chr> <date> <date> <date>
1 1 2024-02-01 Food 2024-02-01 2024-02-04 2024-02-04
2 1 2024-02-02 Food 2024-02-01 2024-02-04 2024-02-04
3 1 2024-02-03 Food 2024-02-01 2024-02-04 2024-02-04
4 1 2024-02-04 Toys 2024-02-04 NA 2024-02-04
5 2 2024-02-01 Newspaper 2024-02-01 2024-02-02 2024-02-02
6 2 2024-02-02 Food 2024-02-02 2024-02-05 2024-02-05
7 2 2024-02-03 Food 2024-02-02 2024-02-05 2024-02-05
8 2 2024-02-04 Food 2024-02-02 2024-02-05 2024-02-05
9 2 2024-02-05 Toys 2024-02-05 NA 2024-02-05