I have an app record and want to calculate the time between two specific events.
My record structure looks like this:
appdata <- data.frame(userid = c(1,1,1,1,1), dayid = c(32,32,32,32,32), activity = c("appstart","levelup","appclose","appstart","appclose"), datesec = c(2670,2726,2755,2787,4161))
appdata
  userid dayid activity datesec
1      1    32 appstart    2670
2      1    32  levelup    2726
3      1    32 appclose    2755
4      1    32 appstart    2787
5      1    32 appclose    4161
I want to know how long the user was active on a given day. So I need to calculate the difference between each appstart and the following appclose, and then sum those differences; here: (2755-2670) + (4161-2787) = 1459.
The new dataset should look like this:
appdata2 <- data.frame(user = c(1), dayid = c(32), usagetime_in_sec = c(1459))
appdata2
  user dayid usagetime_in_sec
1    1    32             1459
Here is my basic approach, but I don't know how to tell R to always calculate the difference between an appstart and the next appclose event:
appdata2 <- appdata %>%
  group_by(userid, dayid) %>%
  summarise(usagetime_in_sec = sum(datesec(type == "appclose") - datesec(type == "appstart")))
You were very close. I think you need to index datesec with square brackets and filter on the activity column, something like:
library(dplyr)
appdata %>%
  group_by(userid, dayid) %>%
  summarise(usagetime_in_sec = sum(datesec[activity == "appclose"] -
                                   datesec[activity == "appstart"]))
# userid dayid usagetime_in_sec
# <dbl> <dbl> <dbl>
#1 1 32 1459
However, make sure each group has an equal number of "appclose" and "appstart" events, in matching order;
otherwise the two vectors will be paired up incorrectly and the sum will be wrong.
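If the counts might not match (for example, a session whose appclose was never logged), one option is to pair each appstart with the appclose that follows it before summing. Below is a minimal sketch, not a definitive solution. It assumes that events can be ordered by datesec within a day and that incomplete sessions should simply be dropped. The names session_id, session_sec and usage are helpers introduced here for illustration, and the .groups argument needs dplyr 1.0.0 or later.

```r
library(dplyr)

# Sample data from the question
appdata <- data.frame(
  userid   = c(1, 1, 1, 1, 1),
  dayid    = c(32, 32, 32, 32, 32),
  activity = c("appstart", "levelup", "appclose", "appstart", "appclose"),
  datesec  = c(2670, 2726, 2755, 2787, 4161)
)

usage <- appdata %>%
  # Only start/close events matter for the session length
  filter(activity %in% c("appstart", "appclose")) %>%
  arrange(userid, dayid, datesec) %>%
  group_by(userid, dayid) %>%
  # session_id (hypothetical helper) increases by one at every appstart
  mutate(session_id = cumsum(activity == "appstart")) %>%
  group_by(userid, dayid, session_id) %>%
  # Keep only sessions that are exactly one appstart followed by one appclose
  filter(n() == 2,
         first(activity) == "appstart",
         last(activity) == "appclose") %>%
  summarise(session_sec = last(datesec) - first(datesec), .groups = "drop") %>%
  group_by(userid, dayid) %>%
  summarise(usagetime_in_sec = sum(session_sec), .groups = "drop")

usage
```

With the sample data this gives the same total as the shorter version above. A session with a missing or duplicated start or close event is dropped as a whole rather than shifting every later pair.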