I have a column, primary, in a dataframe that already has values set. I'm trying to write code so that if all columns that start with "dx" are NA in a row, then primary becomes NA for that row; otherwise, primary keeps its original value.
Note that this is only a segment of the dataframe; there are many other columns.
My current dataframe:
#    dx1  dx2 dx3 dx4 dx5     primary
# 1 I629 <NA>  NA  NA  NA Unspecified
# 2 S065 <NA>  NA  NA  NA        S065
# 3 I629 S066  NA  NA  NA        I629
# 4 I629 I629  NA  NA  NA Unspecified
# 5   NA   NA  NA  NA  NA Unspecified
Desired output:
#    dx1  dx2 dx3 dx4 dx5     primary
# 1 I629 <NA>  NA  NA  NA Unspecified
# 2 S065 <NA>  NA  NA  NA        S065
# 3 I629 S066  NA  NA  NA        I629
# 4 I629 I629  NA  NA  NA Unspecified
# 5   NA   NA  NA  NA  NA          NA
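With dplyr, if_all() tests whether every column starting with "dx" is NA in a given row; case_when() then sets primary to NA for those rows and keeps the original value everywhere else:
library(tidyverse)
df %>%
  mutate(primary = case_when(
    # every dx* column is NA in this row: blank out primary
    if_all(starts_with("dx"), is.na) ~ NA_character_,
    # otherwise keep the existing value
    TRUE ~ primary
  ))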
# A tibble: 5 × 6
  dx1   dx2   dx3   dx4   dx5   primary
  <chr> <chr> <lgl> <lgl> <lgl> <chr>
1 I629  <NA>  NA    NA    NA    Unspecified
2 S065  <NA>  NA    NA    NA    S065
3 I629  S066  NA    NA    NA    I629
4 I629  I629  NA    NA    NA    Unspecified
5 NA    NA    NA    NA    NA    NA
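If you would rather not depend on dplyr, the same row-wise check can be sketched in base R. This is only a minimal illustration: the data frame below is rebuilt by hand from the table printed in the question, and the column types (character for dx1 and dx2, logical for dx3 to dx5) are an assumption based on the tibble header above. The helper names dx_cols and all_na are made up for this example.

```r
# Sample data rebuilt by hand from the table in the question.
# Column types are an assumption (dx1/dx2 character, dx3-dx5 logical).
df <- data.frame(
  dx1 = c("I629", "S065", "I629", "I629", NA),
  dx2 = c(NA, NA, "S066", "I629", NA),
  dx3 = NA, dx4 = NA, dx5 = NA,
  primary = c("Unspecified", "S065", "I629", "Unspecified", "Unspecified"),
  stringsAsFactors = FALSE
)

# Pick the diagnosis columns by name prefix (hypothetical helper name).
dx_cols <- startsWith(names(df), "dx")

# A row is "all NA" when its NA count equals the number of dx columns.
all_na <- rowSums(is.na(df[, dx_cols, drop = FALSE])) == sum(dx_cols)

# Blank out primary only on those rows; all other rows are left untouched.
df$primary[all_na] <- NA_character_

df$primary
```

Because the assignment only touches rows where all_na is TRUE, any other columns in the full dataframe are unaffected, which matters here since the question mentions there are many more columns.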