Search code examples
rfor-loopif-statementgroup

if statement for different id groups


I have 10000 id's, which all have a different number of purchase dates (with a minimum of 1). Per customer id, I want to find out if the number of days between the purchase dates exceeds 183 days (which is half a year). I only need a yes or no (binary) answer. The code I have for now is the following, however it does not work

for (id in 10000){
  for (i in length(purchase_date[id])){
    for (j in length(purchase_date[id])-1){
      if (as.numeric(purchase_date[i]-purchase_date[j])>=183){
        above_half_year <- 1
        }else{
          above_half_year <- 0 
        }
    }
  }
}

The top of the dataset is as follows

   ID  purchase_date 
1   1  2014-01-13 
2   1  2014-04-14 
3   1  2014-08-13 
4   1  2014-09-12 
5   1  2014-11-12 
6   1  2015-02-13 
7   1  2017-02-14 
8   1  2018-12-13 
9   1  2019-04-15
10  2  2016-03-01 
11  3  2016-06-13 
12  3  2016-09-20
13  3  2016-10-20 
14  3  2016-11-21 
15  3  2016-12-20 

The expected output would be

   ID  purchase_date above_half_year
1   1  2014-01-13    0
2   1  2014-04-14    0
3   1  2014-08-13    0
4   1  2014-09-12    0
5   1  2014-11-12    0
6   1  2015-02-13    0
7   1  2017-02-14    1
8   1  2018-12-13    1
9   1  2019-04-15    0
10  2  2016-03-01    0
11  3  2016-06-13    0
12  3  2016-09-20    0
13  3  2016-10-20    0
14  3  2016-11-21    0
15  3  2016-12-20    0

Thank you in advance for your help! I look forward to your suggestions


Solution

  • This is an approach with dplyr

    df %>%
      group_by(ID) %>%
      mutate(above_half_year = if_else(purchase_date - lag(purchase_date) > 183, 1, 0))