R: if all columns starting with a given prefix are NA, set another column to NA


I have a column, primary, in a dataframe that already has values set. I'm trying to write code so that if all columns whose names start with "dx" are NA in a row, primary becomes NA for that row; otherwise it keeps its original value.

Note that this is only a segment of the dataframe; there are many other columns.

My current dataframe:

#    dx1  dx2 dx3 dx4 dx5     primary
# 1 I629 <NA>  NA  NA  NA Unspecified
# 2 S065 <NA>  NA  NA  NA        S065
# 3 I629 S066  NA  NA  NA        I629
# 4 I629 I629  NA  NA  NA Unspecified
# 5 NA   NA    NA  NA  NA Unspecified

Desired output:

#    dx1  dx2 dx3 dx4 dx5     primary
# 1 I629 <NA>  NA  NA  NA Unspecified
# 2 S065 <NA>  NA  NA  NA        S065
# 3 I629 S066  NA  NA  NA        I629
# 4 I629 I629  NA  NA  NA Unspecified
# 5 NA   NA    NA  NA  NA NA
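A reproducible version of the sample might look like the following. The column types are a guess (dx1 and dx2 as character, dx3 to dx5 as all-NA logical columns), and dx_cols and all_dx_na are names made up for illustration. The last lines only check which rows have every dx column missing:

```r
# Hypothetical reconstruction of the sample data; column types are assumed.
df <- data.frame(
  dx1 = c("I629", "S065", "I629", "I629", NA),
  dx2 = c(NA, NA, "S066", "I629", NA),
  dx3 = NA,  # recycled to all five rows
  dx4 = NA,
  dx5 = NA,
  primary = c("Unspecified", "S065", "I629", "Unspecified", "Unspecified"),
  stringsAsFactors = FALSE
)

# Names of the columns that start with "dx"
dx_cols <- grep("^dx", names(df), value = TRUE)

# TRUE for rows where every dx* column is NA
all_dx_na <- rowSums(is.na(df[dx_cols])) == length(dx_cols)

# Row numbers that should end up with primary = NA
which(all_dx_na)
```

With this data, only row 5 is flagged, which matches the desired output above.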

Solution

  • With dplyr

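    library(dplyr)  # provides mutate(), case_when(), if_all(), starts_with()
    
    df %>%
      mutate(primary = case_when(
        # every dx* column is NA in this row: set primary to NA
        if_all(starts_with("dx"), is.na) ~ NA_character_,
        # otherwise keep the existing value
        TRUE ~ primary
      ))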
    
    # A tibble: 5 × 6
      dx1   dx2   dx3   dx4   dx5   primary    
      <chr> <chr> <lgl> <lgl> <lgl> <chr>      
    1 I629  <NA>  NA    NA    NA    Unspecified
    2 S065  <NA>  NA    NA    NA    S065       
    3 I629  S066  NA    NA    NA    I629       
    4 I629  I629  NA    NA    NA    Unspecified
    5 NA    NA    NA    NA    NA    NA
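
For comparison, here is a base R sketch of the same rule that needs no packages. The data frame is a hypothetical reconstruction of the sample (column types are assumed), and is_dx and all_dx_na are illustrative names, not anything from the question:

```r
# Hypothetical reconstruction of the sample data; column types are assumed.
df <- data.frame(
  dx1 = c("I629", "S065", "I629", "I629", NA),
  dx2 = c(NA, NA, "S066", "I629", NA),
  dx3 = NA, dx4 = NA, dx5 = NA,
  primary = c("Unspecified", "S065", "I629", "Unspecified", "Unspecified"),
  stringsAsFactors = FALSE
)

# Logical index of the columns whose names start with "dx"
is_dx <- startsWith(names(df), "dx")

# TRUE for rows with no non-missing value in any dx* column
all_dx_na <- rowSums(!is.na(df[, is_dx, drop = FALSE])) == 0

# Overwrite primary only in those rows; other rows keep their value
df$primary[all_dx_na] <- NA_character_

df
```

The drop = FALSE keeps the subset a data frame even if only one column happens to match the prefix, so rowSums() still works.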