
Comparing each value of a data.table column with the next two values in R


How can I evaluate a column of a data.table against values of the same column, comparing each value with the values in the next two positions? The following example illustrates the problem and the desired result.

library(data.table)
dt <- data.table(a = c(2, 3, 2, 4))
result <- data.table(a = c(2, 3, 2, 4), b = c(TRUE, FALSE, NA, NA))

Solution

  • We can use shift with n = 1:2 and type = 'lead' to build two lead vectors from 'a'. Loop over them with lapply, test each for equality with 'a', Reduce the results to a single logical vector with |, and assign that vector to column 'b'.

    dt[, b := Reduce(`|`, lapply(shift(a, 1:2, type = 'lead'), `==`,  a))]
    dt
    #   a     b
    #1: 2  TRUE
    #2: 3 FALSE
    #3: 2    NA
    #4: 4    NA
    

    As @Mike H. suggested, when we only compare against the next two values, writing the two comparisons out individually may be easier to understand:

    dt[, b := (shift(a, 1, type = 'lead') == a) | (shift(a, 2, type = 'lead') == a)]
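
To see why both forms give the same result, it may help to look at the intermediate values outside of dt[...]. The following is a minimal sketch using the same data as the question. The variable names leads, matches, b_reduce and b_explicit are only illustrative and are not part of the original answer. Note that in R, FALSE | NA evaluates to NA, which is why the last two rows are NA rather than FALSE.

```r
library(data.table)

dt <- data.table(a = c(2, 3, 2, 4))

# shift() with a vector n returns a list: one lead vector per offset
leads <- shift(dt$a, n = 1:2, type = "lead")
# leads[[1]] is c(3, 2, 4, NA); leads[[2]] is c(2, 4, NA, NA)

# Compare each lead vector with 'a' element-wise
matches <- lapply(leads, `==`, dt$a)
# matches[[1]] is c(FALSE, FALSE, FALSE, NA)
# matches[[2]] is c(TRUE, FALSE, NA, NA)

# Combine the comparisons with a vectorised OR
b_reduce <- Reduce(`|`, matches)

# The explicit form with the two comparisons written out
b_explicit <- (shift(dt$a, 1, type = "lead") == dt$a) |
  (shift(dt$a, 2, type = "lead") == dt$a)

# Both approaches should produce the same logical vector
identical(b_reduce, b_explicit)
```

The Reduce form extends to a longer look-ahead by changing 1:2, while the explicit form needs one more comparison per extra position.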