I have data that looks like this sample:
A1<-seq(3,5); A2<-seq(1,5); A3<-seq(2,8)
length(A1)<-7; length(A2)<-7; length(A3)<-7
adf <-as.data.frame(cbind(A1,A2,A3))
A1 A2 A3
1 3 1 2
2 4 2 3
3 5 3 4
4 NA 4 5
5 NA 5 6
6 NA NA 7
7 NA NA 8
I would like to create a new variable that pulls from A1, unless it is a missing value, in which case it pulls from A2, unless there is also a missing value, in which case it pulls from A3. This is what I have tried:
adf %>%
mutate(
B = case_when(
!is.na(A1) ~ A1,
is.na(A1) ~ A2,
(is.na(A1) & is.na(A2) & !is.na(A3)) ~ A3
)
)
A1 A2 A3 B
1 3 1 2 3
2 4 2 3 4
3 5 3 4 5
4 NA 4 5 4
5 NA 5 6 5
6 NA NA 7 NA
7 NA NA 8 NA
This works for the most part but as you can see, in cases where both A1 and A2 are missing, but A3 is not, it return NAs instead of A3. What am I missing?
The coalesce
function handles just this use case. You can do:
adf %>% mutate(B=coalesce(A1, A2, A3))
giving:
A1 A2 A3 B
1 3 1 2 3
2 4 2 3 4
3 5 3 4 5
4 NA 4 5 4
5 NA 5 6 5
6 NA NA 7 7
7 NA NA 8 8