Search code examples
rsubsetcriteria

Select row pair based on a value of next line R


I have a df which looks sth like this:

ID   value
1     A
2     C
3     A
4     B
5     C
6     B
7     A
8     B

I need to get a subset of paired rows where the first row has a value A and is followed by row which has a value B. Result should look like this:

ID  value
3    A
4    B
7    A
8    B

Thanks for your help!


Solution

  • In dplyr, we can use lag and lead to get previous and next values.

    library(dplyr)
    
    df %>%  
      filter(value == "A" & lead(value) == "B" | value == "B" & lag(value) == "A")
    
    #  ID value
    #1  3     A
    #2  4     B
    #3  7     A
    #4  8     B
    

    Similarly in data.table, we can use shift :

    library(data.table)
    
    setDT(df)[value == "A" & shift(value, type = "lead") == "B" | 
              value == "B" & shift(value) == "A"]
    

    data

    df <- structure(list(ID = 1:8, value = structure(c(1L, 3L, 1L, 2L, 
    3L, 2L, 1L, 2L), .Label = c("A", "B", "C"), class = "factor")), 
    class = "data.frame", row.names = c(NA, -8L))