
R: Fill NA by Adding 5 to the Previous Value


HAVE = data.frame(STUDENT = c(1,1,1,2,2,2,3,3),
                  TIME = c(1,2,3,1,2,3,1,2),
                  SCORE = c(7, NA, NA, 5, NA, 19, NA, 2))

WANT = data.frame(STUDENT = c(1,1,1,2,2,2,3,3),
                  TIME = c(1,2,3,1,2,3,1,2),
                  SCORE = c(7, 12, 17, 5, 10, 19, NA, 2))

I want to modify SCORE as follows: if is.na(SCORE), take the previous value for that STUDENT and add 5 to it. The fill should only run downwards, never upwards, so a leading NA (as for STUDENT 3) stays NA.

I tried the following without success; it does not work when a STUDENT has two NAs in a row:

HAVE %>% group_by(STUDENT) %>% mutate(WANT = ifelse(is.na(SCORE), lag(SCORE) + 5, SCORE))
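To make the failure concrete, here is a small sketch on a single student's scores (the names `x` and `attempt` are introduced only for this illustration). It suggests the cause is that lag() shifts the original column, so the value sitting before the second NA is itself NA:

```r
library(dplyr)

# One student's scores, with two NAs in a row
x <- c(7, NA, NA)

# lag() shifts the ORIGINAL vector; it never sees values filled in earlier
lag(x)
# [1] NA  7 NA

# Same logic as the attempt above, applied to the plain vector
attempt <- ifelse(is.na(x), lag(x) + 5, x)
attempt
# [1]  7 12 NA
```

The first NA is filled because its predecessor is 7, but the second NA only sees another NA, so a single vectorised lag() cannot carry the running value forward.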

Solution

  • You can use purrr::accumulate:

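    library(dplyr)  # the .by argument requires dplyr >= 1.1.0
    library(purrr)
    # In the lambda, .x is the running result and .y the current SCORE:
    # an NA becomes the previous (possibly filled) value + 5, anything else is kept.
    HAVE |>
      mutate(SCORE = accumulate(SCORE, ~ if(is.na(.y)) .x + 5 else .y),
             .by = STUDENT)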
    
    #   STUDENT TIME SCORE
    # 1       1    1     7
    # 2       1    2    12
    # 3       1    3    17
    # 4       2    1     5
    # 5       2    2    10
    # 6       2    3    19
    # 7       3    1    NA
    # 8       3    2     2
    

    Or, in base R, with Reduce and accumulate = TRUE:

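    # Walk down the column: x is the running result, y the current element.
    # An NA element becomes the running value plus n; a non-NA element is kept.
    na_add <- function(col, n = 5) {
      Reduce(function(x, y) if (is.na(y)) x + n else y, col, accumulate = TRUE)
    }
    # ave() applies na_add separately within each STUDENT group.
    with(HAVE, ave(SCORE, STUDENT, FUN = na_add))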
    #[1]  7 12 17  5 10 19 NA  2
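Both answers step through the column one element at a time. If that ever becomes a concern, a loop-free variant in base R might look like the sketch below. It is not taken from the answers above: the helper name `fill_add_vec` and its internal variables are made up for illustration. It relies on the observation that every filled value equals the last observed value plus n times the number of rows since that value.

```r
HAVE <- data.frame(STUDENT = c(1, 1, 1, 2, 2, 2, 3, 3),
                   TIME    = c(1, 2, 3, 1, 2, 3, 1, 2),
                   SCORE   = c(7, NA, NA, 5, NA, 19, NA, 2))

# Hypothetical helper: fill NAs with (last observed value) + n * (rows since it)
fill_add_vec <- function(x, n = 5) {
  # run = how many non-NA values have been seen so far (0 before the first one)
  run <- cumsum(!is.na(x))
  # Last observed value for each position; positions with run == 0 map to NA
  base <- c(NA_real_, x[!is.na(x)])[run + 1]
  # Number of rows since the last observed value within each run
  steps <- ave(seq_along(x), run, FUN = seq_along) - 1
  base + n * steps
}

# Apply per STUDENT, as in the Reduce() answer
result <- with(HAVE, ave(SCORE, STUDENT, FUN = fill_add_vec))
result
# [1]  7 12 17  5 10 19 NA  2
```

A leading NA stays NA because no observed value precedes it, which matches the "downwards only" requirement in the question.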