Tags: r, data.table, recode

Recode NA with values from similar ID with data.table


I'm learning data.table and trying to replace each NA in c with the non-missing value of c from the same group of b.

library(data.table)
dt <- data.table(a = rep(1:3, 2),
                 b = c(rep(1,3), rep(2, 3)),
                 c = c(NA, 4, NA, 6, NA, NA))

> dt
   a b  c
1: 1 1 NA
2: 2 1  4
3: 3 1 NA
4: 1 2  6
5: 2 2 NA
6: 3 2 NA

I would like to get this:

> dt
   a b  c
1: 1 1  4
2: 2 1  4
3: 3 1  4
4: 1 2  6
5: 2 2  6
6: 3 2  6

I tried these, but neither gives the desired result.

dt[, c := ifelse(is.na(c), !is.na(c), c), by = b]
dt[is.na(c), c := dt[!is.na(c), .(c)], by = b]

I'd appreciate some help, and a brief explanation of how I should think about a problem like this when using the data.table approach.


Solution

  • Assuming the simple case where each level of b has just one non-missing value of c:

    dt[, c := c[!is.na(c)][1], by = b]
    dt
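
On the "how to think about it" part: with by = b, the j expression is evaluated once per group, and inside it c is only that group's slice of the column. So c[!is.na(c)][1] picks the group's first non-missing value, and := recycles that single value over every row of the group. In the second attempt, the inner dt[!is.na(c), .(c)] is evaluated on the whole table rather than on the current group, which is one reason it cannot give a per-group fill. The sketch below runs the solution on the question's data and then shows a variant for the case where a group might hold more than one non-missing value. The table dt2 and its values are made up for illustration, and fifelse() assumes data.table 1.12.4 or later.

```r
library(data.table)

# Same example data as in the question
dt <- data.table(a = rep(1:3, 2),
                 b = c(rep(1, 3), rep(2, 3)),
                 c = c(NA, 4, NA, 6, NA, NA))

# Per group of b: take the first non-missing c and assign it to every row
# of the group (the length-1 result is recycled to the group size).
dt[, c := c[!is.na(c)][1], by = b]
print(dt)

# Hypothetical variant: a group may hold several non-missing values, and
# only the NA rows should be filled. dt2 is illustrative data, not from
# the question.
dt2 <- data.table(b = c(1, 1, 1, 2, 2),
                  c = c(NA, 4, 5, NA, NA))

# fifelse() keeps existing values and fills NA rows with the group's first
# non-missing value; a group with no non-missing value stays NA.
dt2[, c := fifelse(is.na(c), c[!is.na(c)][1], c), by = b]
print(dt2)
```

The variant leaves the 5 in group 1 untouched, whereas the one-line solution would overwrite it with 4, so which form fits depends on whether the "one value per group" assumption really holds in your data.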