Search code examples
rdatedatetimegroup-bysum

Calculate Sum of Random observations as sum per week in R


I have a dataset of random, sometimes infrequent, events that I want to count as a sum per week. Due to the randomness they are not linear so other examples I have tried so far are not applicable.

The data is similar to this:


df_date <- data.frame( Name = c("Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim","Jim",
                                "Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue","Sue"),
                       Dates = c("2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
                                 "2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28",
                                 "2010-1-1", "2010-1-2", "2010-01-5","2010-01-17","2010-01-20",
                                 "2010-01-29","2010-02-6","2010-02-9","2010-02-16","2010-02-28"),
                       Event = c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1) )

What I'm trying to do is create a new table that contains the sum of events per week in the calendar year.

In this case producing something like this:

Name   Week   Events
Jim    1      3
Sue    1      3
Jim    2      0
Sue    x ...  x 

and so on...

Solution

  • Here is an approach that gets you each ISO week for each individual, with zeros when there are no events for that week for that individual:

    get_dates_df <- function(d) {
      data.frame(date = seq(min(d, na.rm=T),max(d,na.rm=T),1)) %>% 
        mutate(Year=year(date), Week=week(date)) %>% 
        distinct(Year, Week)    
    }
    
    df_date = df_date %>% mutate(Dates=lubridate::ymd(Dates))
    
    left_join(
      full_join(distinct(df_date %>% select(Name)), get_dates_df(df_date$Dates), by=character()),
      df_date %>% 
      group_by(Name,Year=year(Dates), Week=week(Dates)) %>% 
      summarize(Events = sum(Event), .groups="drop")
    ) %>% 
      mutate(Events=if_else(is.na(Events),0,Events))
    

    Output:

       Name Year Week Events
    1   Jim 2010    1      3
    2   Jim 2010    2      0
    3   Jim 2010    3      2
    4   Jim 2010    4      0
    5   Jim 2010    5      1
    6   Jim 2010    6      2
    7   Jim 2010    7      1
    8   Jim 2010    8      0
    9   Jim 2010    9      1
    10  Sue 2010    1      3
    11  Sue 2010    2      0
    12  Sue 2010    3      2
    13  Sue 2010    4      0
    14  Sue 2010    5      1
    15  Sue 2010    6      2
    16  Sue 2010    7      1
    17  Sue 2010    8      0
    18  Sue 2010    9      1