r, cumsum

cumsum only for certain group ID


I want to give each run of negative results in a series an ID. The series is represented by a vector, call it a, in which a negative result is coded as a == 0 (positive results are a == 1 in the example). The first run of negative values should all receive the same ID, i.e. 1. The next run of negative values should get a new ID, i.e. ID == 2, and so on. I need to preserve all rows and assign zeros to runs of positive results. The example below shows a sample series and the desired outcome.

    data.frame(a = rep(c(1, 0, 1, 0), each = 4), ID = rep(c(0, 1, 0, 2), each = 4))
       a ID
    1  1  0
    2  1  0
    3  1  0
    4  1  0
    5  0  1
    6  0  1
    7  0  1
    8  0  1
    9  1  0
    10 1  0
    11 1  0
    12 1  0
    13 0  2
    14 0  2
    15 0  2
    16 0  2

Solution

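  • Try this: use rle() to split a into runs, then give each run of 0s the next ID and each run of 1s an ID of 0.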

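    a <- rep(c(1, 0, 1, 0, 1, 0), each = 4)
    #================================
    
    ID <- c()      # IDs are appended run by run
    r <- rle(a)    # run lengths and values of a
    j <- 1L        # ID to give to the next run of 0s
    for (i in seq_along(r$lengths)) {
      if (r$values[i] == 1) {
        # run of 1s (positive results): ID is 0
        ID <- c(ID, rep(0, r$lengths[i]))
      } else {
        # run of 0s (negative results): assign the current ID, then advance it
        ID <- c(ID, rep(j, r$lengths[i]))
        j <- j + 1L
      }
    }
    #================================
    df <- data.frame(a = a, ID = ID)
    df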
    #>    a ID
    #> 1  1  0
    #> 2  1  0
    #> 3  1  0
    #> 4  1  0
    #> 5  0  1
    #> 6  0  1
    #> 7  0  1
    #> 8  0  1
    #> 9  1  0
    #> 10 1  0
    #> 11 1  0
    #> 12 1  0
    #> 13 0  2
    #> 14 0  2
    #> 15 0  2
    #> 16 0  2
    #> 17 1  0
    #> 18 1  0
    #> 19 1  0
    #> 20 1  0
    #> 21 0  3
    #> 22 0  3
    #> 23 0  3
    #> 24 0  3
    

    Created on 2022-06-22 by the reprex package (v2.0.1)
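
    The same result can probably be had without the loop. This is a sketch rather than part of the answer above, and it assumes a contains only 0s and 1s. It uses rle() as before, then cumsum() to count the runs of 0s; the names is_zero and run_id are made up for illustration.

```r
a <- rep(c(1, 0, 1, 0, 1, 0), each = 4)

r <- rle(a)                 # run lengths and values of a
is_zero <- r$values == 0    # TRUE for runs of 0s (negative results)

# cumsum(is_zero) counts the runs of 0s seen so far;
# multiplying by is_zero sets the runs of 1s back to 0
run_id <- cumsum(is_zero) * is_zero

# expand the per-run IDs back to one ID per row
ID <- rep(run_id, r$lengths)
df <- data.frame(a = a, ID = ID)
df
```

    This avoids growing ID inside a loop, which may matter for long vectors. If a can hold values other than 0 and 1, the is_zero condition would need to be adapted.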