
Limit na.locf in zoo package


I would like to carry the last observation forward for a variable, but only for up to 2 observations. That is, for a gap of 3 or more NAs, I want to fill only the first 2 NAs with the last observation and leave the rest as NA.

If I do this with zoo::na.locf, the maxgap parameter means that when a gap is longer than 2, none of its NAs are replaced, not even the first 2. Is there an alternative?

x <- c(NA,3,4,5,6,NA,NA,NA,7,8)
zoo::na.locf(x, maxgap = 2) # Doesn't replace the first 2 NAs after the 6, because the gap of NAs has length 3.
Desired_output <- c(NA,3,4,5,6,6,6,NA,7,8)

Solution

  • A solution using base R:

    ave(x, cumsum(!is.na(x)), FUN = function(i){ i[1:pmin(length(i), 3)] <- i[1]; i })
    # [1] NA  3  4  5  6  6  6 NA  7  8
    

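    cumsum(!is.na(x)) groups each run of NAs with the most recent non-NA value that precedes it.

    function(i){ i[1:pmin(length(i), 3)] <- i[1]; i } overwrites the first three elements of each group (the leading non-NA value plus up to two NAs after it) with that leading value, so at most two NAs per gap are filled. NAs at the very start of x form their own group and stay NA.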
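The same idea can be wrapped in a small function so the carry-forward limit is a parameter instead of the hard-coded 3. This is only a sketch in base R: the name locf_limit and its argument n are made up for illustration and are not part of zoo.

```r
# Hypothetical helper (not a zoo function): carry the last observation
# forward over at most `n` consecutive NAs.
locf_limit <- function(x, n = 2) {
  # Each non-NA value starts a new group; the NAs after it join that group.
  grp <- cumsum(!is.na(x))
  ave(x, grp, FUN = function(i) {
    # Leading value plus up to n following elements, capped at group length.
    k <- min(length(i), n + 1)
    i[seq_len(k)] <- i[1]
    i
  })
}

x <- c(NA, 3, 4, 5, 6, NA, NA, NA, 7, 8)
locf_limit(x, n = 2)
# [1] NA  3  4  5  6  6  6 NA  7  8
```

With n = 0 the vector comes back unchanged, and leading NAs are never filled because their group has no preceding value.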