r, dataframe, function, for-loops, apply

Replace each non-NA value in a data frame with the name of its column


For example, my df is:

         >dfABy 
         A    B     C

         56   NA  NA
         NA   45  NA
         NA   77  NA 
         67   NA  12 
         NA   65  3

I want to get the following data frame, where every non-NA value is replaced by the name of its column:

         >dfABy 
         A    B    C

         A    NA  NA
         NA   B   NA
         NA   B   NA 
         A    NA  C
         NA   B   C

Solution

  • Here is an option in base R. Convert the data to a logical matrix ('i1') that is TRUE for non-NA values and FALSE for NA. Replicate the column names according to the column index ('nm1'), so that there is one name per cell. Then use 'i1' to assign the corresponding column names to the non-NA cells.

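    # TRUE where a value is present, FALSE where it is NA
    i1 <- !is.na(dfABy)
    # column name for every cell, in column-major order
    nm1 <- names(dfABy)[col(dfABy)]
    # overwrite the non-NA cells with their column names
    dfABy[i1] <- nm1[i1]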
    

    Output:

    dfABy
    #     A    B    C
    #1    A <NA> <NA>
    #2 <NA>    B <NA>
    #3 <NA>    B <NA>
    #4    A <NA>    C
    #5 <NA>    B    C
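
To see why the index and the names line up, it can help to print the two intermediate objects. The following is a minimal sketch on the example data from the question; `res` is a name introduced here only so that the original `dfABy` is left untouched.

```r
# Rebuild the example data from the question.
dfABy <- data.frame(A = c(56L, NA, NA, 67L, NA),
                    B = c(NA, 45L, 77L, NA, 65L),
                    C = c(NA, NA, NA, 12L, 3L))

i1  <- !is.na(dfABy)             # 5 x 3 logical matrix, TRUE where a value exists
nm1 <- names(dfABy)[col(dfABy)]  # 15 names in column-major order: A x5, B x5, C x5

print(i1)
print(nm1)

# nm1[i1] picks the names in column-major order, which is the same order
# in which the logical matrix selects cells of the data frame.
res <- dfABy          # 'res' is an illustrative copy, not part of the answer
res[i1] <- nm1[i1]
print(res)
```

Because a character value is written into integer columns, every column that receives a name is coerced to character.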
    

    Or in a single line:

    # NA^is.na() is NA for missing cells and 1 otherwise, so the index is the column number or NA
    dfABy[] <- names(dfABy)[(NA^is.na(dfABy)) * col(dfABy)]
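
The single-line form leans on two facts about R: `NA^0` is 1 while `NA^1` is NA, and indexing a vector with NA returns NA. Below is a minimal sketch of those steps unrolled, indexing `names(dfABy)` directly with the resulting column numbers. The names `na_mask`, `idx`, and `res` are introduced only for illustration.

```r
# Rebuild the example data from the question.
dfABy <- data.frame(A = c(56L, NA, NA, 67L, NA),
                    B = c(NA, 45L, 77L, NA, 65L),
                    C = c(NA, NA, NA, 12L, 3L))

na_mask <- NA^is.na(dfABy)     # NA where the cell is missing, 1 where it has a value
idx     <- na_mask * col(dfABy)  # column number where a value exists, NA elsewhere
print(idx)

res <- dfABy                   # illustrative copy, so dfABy itself is not modified
# as.vector() drops the matrix shape; the 15 names fill the data frame column by column.
res[] <- names(dfABy)[as.vector(idx)]
print(res)
```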
    

    Or using tidyverse

    library(dplyr)
    dfABy %>%
        mutate(across(everything(), ~ replace(., !is.na(.), cur_column())))
    #     A    B    C
    #1    A <NA> <NA>
    #2 <NA>    B <NA>
    #3 <NA>    B <NA>
    #4    A <NA>    C
    #5 <NA>    B    C
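
The same per-column idea can also be sketched in base R without dplyr, by pairing each column with its name via `Map()`. This is only an illustrative alternative, not part of the original answer, and `label_non_na()` is a hypothetical helper name.

```r
# Hypothetical helper: replace every non-NA value with the name of its column.
label_non_na <- function(df) {
  # Map() walks over the columns and their names in parallel.
  df[] <- Map(function(column, name) replace(column, !is.na(column), name),
              df, names(df))
  df
}

# Rebuild the example data from the question.
dfABy <- data.frame(A = c(56L, NA, NA, 67L, NA),
                    B = c(NA, 45L, 77L, NA, 65L),
                    C = c(NA, NA, NA, 12L, 3L))

res <- label_non_na(dfABy)
print(res)
```

Assigning with `df[] <-` keeps the data frame structure (row names and column names) and only swaps the column contents.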
    

    Data:

    dfABy <- structure(list(A = c(56L, NA, NA, 67L, NA), B = c(NA, 45L, 77L, 
    NA, 65L), C = c(NA, NA, NA, 12L, 3L)), class = "data.frame", row.names = c(NA, 
    -5L))