Tags: r, recursion, tidyverse, dplyr, tidy

Create recursive variables in tidy


I'm trying to write a tidy solution that creates a duration variable based on whether some event has happened or not. This is very simple to do in a for loop:

    library(tidyverse)
    df <- tibble(event = c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1))
    df$dur <- NA
    df$dur[1] <- 0
    for (i in 2:nrow(df)) {
      if (df$event[i] == 0) {
        df$dur[i] <- df$dur[i - 1] + 1  # no event: extend the running duration
      } else {
        df$dur[i] <- 0                  # event: reset the duration
      }
    }
    print(df)

However, I cannot seem to find a tidy solution to this. I have tried using purrr's accumulate function, but it gives me the wrong output:

    library(tidyverse)
    df <- tibble(event = c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1))
    df <- df %>%
      mutate(dur = case_when(
        event == 1 ~ 0,
        TRUE ~ purrr::accumulate(event, ~ .x + 1, .init = -1)[-1]
      ))
    print(df)

Any suggestions on how to do this?


Solution

  • Here is one way to do it: start a new group each time event == 1 by grouping on cumsum(event == 1), then number the rows within each group starting from 0, i.e.

    library(dplyr)
    
    df %>% 
      group_by(grp = cumsum(event == 1)) %>% 
      mutate(dur = seq(n()) - 1)
    
    which gives,
    
    # A tibble: 12 x 3
    # Groups:   grp [5]
       event   grp   dur
       <dbl> <int> <dbl>
     1     0     0     0
     2     0     0     1
     3     0     0     2
     4     0     0     3
     5     1     1     0
     6     0     1     1
     7     0     1     2
     8     1     2     0
     9     1     3     0
    10     0     3     1
    11     0     3     2
    12     1     4     0
    

    NOTE: You can ungroup and remove the grp column at the end if you'd like, e.g. by appending ungroup() %>% select(-grp) to the pipeline.
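
For completeness, here is a minimal sketch that puts the pieces together. It assumes the same df as in the question. The first pipeline is the grouping approach above with the helper column grp dropped afterwards. The second is an accumulate-based variant, offered only as one possible alternative: the attempt in the question yields the wrong result because its lambda (~ .x + 1) never looks at the current event, so the running count never resets. The names by_group and by_accumulate are made up for this illustration.

```r
library(dplyr)
library(purrr)

df <- tibble::tibble(event = c(0, 0, 0, 0, 1, 0, 0, 1, 1, 0, 0, 1))

# Grouping approach, with the helper column removed at the end.
by_group <- df %>%
  group_by(grp = cumsum(event == 1)) %>%  # grp increases by one at every event
  mutate(dur = seq(n()) - 1) %>%          # 0, 1, 2, ... within each group
  ungroup() %>%
  select(-grp)

# accumulate() variant: .x is the running duration, .y is the current event.
# The first row is fixed at 0 via .init, mirroring df$dur[1] <- 0 in the loop,
# so only event[-1] is iterated over.
by_accumulate <- df %>%
  mutate(dur = accumulate(event[-1], ~ if (.y == 1) 0 else .x + 1, .init = 0))

# Check that the two approaches agree on this data.
all.equal(by_group$dur, by_accumulate$dur)
```

The grouping version is vectorised and is probably the better default. The accumulate version stays closer to the original loop, which may be easier to adapt if the reset rule becomes more involved.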