r, dataframe, dplyr, lag, lead

R lag/lead - how to ignore rows before and after existing rows


I have this data frame in R:

   id      a        b        c        d
1  42      3        2       NA        5
2  42     NA        6       NA        6
3  42      1       NA        7        8

With a call like this:

library(dplyr)

dataframe %>%
 mutate(e = lead(d)) 

In the third row I get NA, since there is no fourth row. How can I get the value from the first row (5) there instead? The result should look like this:

   id      a        b        c        d         e
1  42      3        2       NA        5         6
2  42     NA        6       NA        6         8
3  42      1       NA        7        8         5

Solution

  • We can pass first(d) as the default argument of lead(). The default fills the position that has no following row, so the last row receives the first value of d instead of NA.

    library(dplyr)
    
    dat2 <- dat %>%
      mutate(e = lead(d, default = first(d)))
    dat2
    #   id  a  b  c d e
    # 1 42  3  2 NA 5 6
    # 2 42 NA  6 NA 6 8
    # 3 42  1 NA  7 8 5
    

    DATA

    dat <- read.table(text = "   id      a        b        c        d
    1  42      3        2       NA        5
    2  42     NA        6       NA        6
    3  42      1       NA        7        8",
                      header = TRUE)
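
Since the title also mentions lag, here is a minimal sketch of the mirror case, plus a wrap-around shift for steps larger than one. It assumes the same dat as above. lead_wrap() is a hypothetical helper written for this example, not a dplyr function; it only uses base R modular indexing.

```r
library(dplyr)

dat <- read.table(text = "   id      a        b        c        d
1  42      3        2       NA        5
2  42     NA        6       NA        6
3  42      1       NA        7        8",
                  header = TRUE)

# Hypothetical helper (not part of dplyr): shift x forward by n positions,
# wrapping around so that values from the start fill the end.
# A negative n shifts backward, i.e. a wrap-around lag.
lead_wrap <- function(x, n = 1L) {
  len <- length(x)
  if (len == 0L) return(x)
  x[((seq_len(len) - 1L + n) %% len) + 1L]
}

dat3 <- dat %>%
  mutate(
    e_lead = lead_wrap(d),               # same result as lead(d, default = first(d))
    e_lag  = lag(d, default = last(d))   # mirror case: row 1 takes the last value of d
  )

dat3$e_lead
# 6 8 5
dat3$e_lag
# 8 5 6
```

If the real data holds several id values and the wrap should happen within each id (an assumption about the use case, since the example has a single id), adding group_by(id) before mutate() makes first(d) and last(d) evaluate per group.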