Is it possible to only merge data for values that are missing?
For example, say I have two datasets. D1 is my priority dataset, but I want to use information from D2 to fill in any missing data in D1. If D1 and D2 have conflicting values, then I want to keep the values in D1 and discard D2.
D1 <- data.frame(
id=seq(1,3),
x=c("cow",NA,"sheep"))
D2 <- data.frame(
id=seq(1,3),
x=c("cow","turtle","parrot"))
Ideally, the final dataset would look like this:
D3 <- data.frame(
id=seq(1,3),
x=c("cow","turtle","sheep"))
Here, turtle would replace the NA, but parrot wouldn't replace sheep.
In base R, you can use match:
inds <- is.na(D1$x)
D1$x[inds] <- D2$x[match(D1$id[inds], D2$id)]
D1
# id x
#1 1 cow
#2 2 turtle
#3 3 sheep
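Because match looks rows up by id rather than by position, the same idea should also work when D2 is in a different row order or lacks some ids. Below is a minimal sketch of that, wrapped in a helper; the name fill_missing and its key/col arguments are hypothetical (not part of base R), and the example data is extended with an extra id purely for illustration. stringsAsFactors = FALSE is set explicitly on the assumption that the code might run on R versions before 4.0, where strings would otherwise become factors.

```r
# Hypothetical helper: fill NA values of `col` in `primary` from `secondary`,
# matching rows on `key`. Non-missing values in `primary` are never overwritten.
fill_missing <- function(primary, secondary, key = "id", col = "x") {
  inds <- is.na(primary[[col]])
  # match() gives the row position in `secondary` for each key (NA if absent)
  pos <- match(primary[[key]][inds], secondary[[key]])
  primary[[col]][inds] <- secondary[[col]][pos]
  primary
}

D1 <- data.frame(id = c(1, 2, 3, 4),
                 x = c("cow", NA, "sheep", NA),
                 stringsAsFactors = FALSE)
# D2 is in a different row order and has no row for id 4
D2 <- data.frame(id = c(3, 2, 1),
                 x = c("parrot", "turtle", "cow"),
                 stringsAsFactors = FALSE)

D3 <- fill_missing(D1, D2)
D3$x
# [1] "cow"    "turtle" "sheep"  NA
```

The id with no counterpart in D2 simply stays NA, and sheep is kept because only rows where D1 is missing are ever assigned.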