library(tidyverse)
testdata <- tibble(ID=c(1,NA,NA,2,NA,3),
Observation = LETTERS[1:6])
testdata1 <- testdata %>%
mutate(
ID1 = case_when(
is.na(ID) ~ lag(ID, 1),
TRUE ~ ID
)
)
testdata1
I have a dataset like testdata, with a valid ID only when ID changes. There can be an arbitrary number of records in a set, but the above case_when and lag() structure does not fill in ID for all records, just for record 2 in each group. Is there a way to get the 3rd (or deeper) IDs filled with the appropriate value?
We can use fill
from the tidyr
package. Since you are using tidyverse
, tidyr
is already inlcuded.
testdata1 <- testdata %>%
fill(ID)
testdata1
# # A tibble: 6 x 2
# ID Observation
# <dbl> <chr>
# 1 1 A
# 2 1 B
# 3 1 C
# 4 2 D
# 5 2 E
# 6 3 F
Or we can use na.locf
from the zoo
package.
library(zoo)
testdata1 <- testdata %>%
mutate(ID = na.locf(ID))
testdata1
# # A tibble: 6 x 2
# ID Observation
# <dbl> <chr>
# 1 1 A
# 2 1 B
# 3 1 C
# 4 2 D
# 5 2 E
# 6 3 F