Search code examples
rdateaggregatesummarylubridate

Average Time Difference Between Two Dates PER Group


I have a dataframe with users and their visit dates. I'm trying to find the average time difference between visits per group. The output would be in days or fraction of a day

require(lubridate)
so <- data.frame(visit_dates = c("12/4/2016","12/6/2016","12/7/2016","12/3/2016","12/7/2016","12/10/2016"), person = c("1","1","1","2","2","2"))


so$visit_dates <- mdy(format(as.POSIXct(strptime(so$visit_dates,"%m/%d/%Y",tz = "")),format = "%m/%d/%Y"))

The output would look something like:

person    avgTimeBetweenVisit
1                 2.5
2                 3.5

Solution

  • Try data.table:

    require(lubridate)
    require(data.table)
    so <- data.frame(visit_dates = c("12/4/2016","12/6/2016","12/7/2016","12/3/2016","12/7/2016","12/10/2016"), person = c("1","1","1","2","2","2"))
    
    
    so$visit_dates <- mdy(format(as.POSIXct(strptime(so$visit_dates,"%m/%d/%Y",tz = "")),format = "%m/%d/%Y"))
    so <- data.table(so, key = c("person", "visit_dates"))
    res <- so[, .(avgTimeBetweenVisit = mean(diff(visit_dates))), by = person]
    print(res)
    # person avgTimeBetweenVisit
    # 1:      1            1.5 days
    # 2:      2            3.5 days