
Group adjacent identical elements of a vector


I have the following vector:

c("a", "a", "b", "a", "a", "c", "c", "c")

and I would like to split its elements into groups of adjacent identical values. The result should look like this:

[[1]] ("a", "a"), [[2]] ("b"), [[3]] ("a", "a"), [[4]] ("c", "c", "c")

Although the elements of group 1 and group 3 are the same, they are not neighbors, so they belong to different groups. I tried using a for loop to do this, but the result was not good enough.


Solution

  • One option uses rleid() from the data.table package (v is the vector from the question, and data.table is assumed to be loaded)

    > split(v,rleid(v))
    $`1`
    [1] "a" "a"
    
    $`2`
    [1] "b"
    
    $`3`
    [1] "a" "a"
    
    $`4`
    [1] "c" "c" "c"
    

    Or a base R option, which starts a new group wherever an element differs from the one before it

    > split(v,cumsum(c(TRUE,head(v,-1)!=v[-1])))
    $`1`
    [1] "a" "a"
    
    $`2`
    [1] "b"
    
    $`3`
    [1] "a" "a"
    
    $`4`
    [1] "c" "c" "c"
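
Since the question is about run-length encoding, the same grouping can probably also be built from base R's rle(). The following is only a sketch, not part of the original answer: the names r, run_id, idx and groups are made up for illustration, and v is assumed to be the vector from the question.

```r
# Assumed input: the vector from the question
v <- c("a", "a", "b", "a", "a", "c", "c", "c")

# rle() returns the length and value of each run of identical neighbors
r <- rle(v)

# Repeat each run's index by its length to get one group id per element
run_id <- rep(seq_along(r$lengths), times = r$lengths)

# Split on the run id, as in the rleid() and cumsum() versions
groups <- split(v, run_id)

# The cumsum() index from the base R option, kept for comparison
idx <- cumsum(c(TRUE, head(v, -1) != v[-1]))
```

Here run_id and idx should hold the same group ids, so either can be passed to split(). Wrapping the result in unname() drops the "1", "2", ... names if a plain list is preferred.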