I have a data frame as below:
user time
____ ____
1 2017-09-01 00:01:01
1 2017-09-01 00:01:20
1 2017-09-01 00:03:01
1 2017-09-01 00:10:01
1 2017-09-01 00:11:01
2 2017-09-01 00:01:03
2 2017-09-01 00:01:08
2 2017-09-01 00:03:01
From this data frame I want to create the follow group for each user like below:
user time follow_group
____ ____________________ _____________
1 2017-09-01 00:01:01 1
1 2017-09-01 00:01:20 1
1 2017-09-01 00:03:01 1
1 2017-09-01 00:10:01 2
1 2017-09-01 00:11:01 2
2 2017-09-01 00:01:03 1
2 2017-09-01 00:01:08 1
2 2017-09-01 00:03:01 1
The follow group for each user changes when the time difference is greater than 5 minutes.
I tried by taking the lag and subtracting:
data[, previous_request_time:=c(NA, time[-.N]), by=user]
But that didn't seem to work. Any help is appreciated.
Just do a difftime
operation and check if the difference is greater than 5 mins. Then a cumulative sum will give your group counter:
dat[,
follow_group := cumsum(difftime(time, shift(time, fill=-Inf), units="mins") > 5),
by=user
]
# user time follow_group
#1: 1 2017-09-01 00:01:01 1
#2: 1 2017-09-01 00:01:20 1
#3: 1 2017-09-01 00:03:01 1
#4: 1 2017-09-01 00:10:01 2
#5: 1 2017-09-01 00:11:01 2
#6: 2 2017-09-01 00:01:03 1
#7: 2 2017-09-01 00:01:08 1
#8: 2 2017-09-01 00:03:01 1
You could similarly use just diff
if you didn't want to be too explicit about the units:
dat[, flwgrp := cumsum(c(Inf, diff(time)) > 5*60), by=user]