R - How to fill in values in NA, but only when ending value is the same as the beginning value?


I have the following example data:

Example <- data.frame(col1 =c(1, NA, NA, 4, NA, NA, 6, NA, NA, NA, 6, 8, NA, 2, NA))

col1
1
NA
NA
4
NA
NA
6
NA
NA
NA
6
8
NA
2
NA

I want to fill the NAs with the value from above, but only if the NAs lie between two identical values. In this example, the first NA gap (between 1 and 4) should not be filled with 1s, but the gap between the first 6 and the second 6 should be filled with 6s. All other NAs should stay NA. The result should therefore look like this:

col1
1
NA
NA
4
NA
NA
6
6
6
6
6
8
NA
2
NA

In reality I have not 15 observations but over 50,000, so I need an efficient solution, which is proving harder than I thought. I tried to use the fill function but was not able to come up with a solution.


Solution

  • One dplyr and zoo option could be:

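    library(dplyr)
    library(zoo)

    Example %>%
        # cond is TRUE where the nearest non-NA value above equals the nearest one below
        mutate(cond = na.locf0(col1) == na.locf0(col1, fromLast = TRUE),
               col1 = ifelse(cond, na.locf0(col1), col1)) %>%
        select(-cond)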
    
       col1
    1     1
    2    NA
    3    NA
    4     4
    5    NA
    6    NA
    7     6
    8     6
    9     6
    10    6
    11    6
    12    8
    13   NA
    14    2
    15   NA
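
The idea behind the answer above is to carry the last observation forward and the next observation backward, then fill only where the two agree. If adding zoo is not an option, the same idea can be sketched in base R. This is a minimal sketch, not part of the original answer: the helper names `locf` and `fill_between_equal` are hypothetical, and the code assumes `col1` is a plain numeric vector.

```r
# Hypothetical helper: carry the last non-NA value forward (leading NAs stay NA).
locf <- function(x) {
  # Index of the most recent non-NA element at or before each position (0 if none yet).
  idx <- cummax(ifelse(is.na(x), 0L, seq_along(x)))
  idx[idx == 0L] <- NA
  x[idx]
}

# Hypothetical helper: fill an NA only when the nearest non-NA values
# on both sides of it are equal.
fill_between_equal <- function(x) {
  fwd <- locf(x)             # nearest non-NA value above
  bwd <- rev(locf(rev(x)))   # nearest non-NA value below
  same <- !is.na(fwd) & !is.na(bwd) & fwd == bwd
  x[same] <- fwd[same]
  x
}

Example <- data.frame(col1 = c(1, NA, NA, 4, NA, NA, 6, NA, NA, NA, 6, 8, NA, 2, NA))
Example$col1 <- fill_between_equal(Example$col1)
Example
```

Everything here is vectorised (`cummax`, `rev`, logical indexing), so it should scale to 50,000 rows without an explicit loop. Leading and trailing NAs are left alone because one of the two fills is NA at those positions.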