Concatenate column names into one column, conditional on their values, using mutate, across and case_when


I would like to:

  1. Use across and case_when to check whether each of the columns A1-A3 equals 1,
  2. concatenate the names of the columns where that is true, and
  3. use mutate to store the concatenated names in a new column.

My dataframe:

library(dplyr)  # re-exports tribble() and %>%

df <- tribble(
  ~ID, ~A1, ~A2, ~A3,
  1, 0, 1, 1,
  2, 0, 1, 1,
  3, 1, 1, 1,
  4, 1, 0, 1,
  5, 0, 1, 0
)

Desired Output:

# A tibble: 5 x 5
     ID    A1    A2    A3 New_Col 
  <dbl> <dbl> <dbl> <dbl> <chr>   
1     1     0     1     1 A2 A3   
2     2     0     1     1 A2 A3   
3     3     1     1     1 A1 A2 A3
4     4     1     0     1 A1 A3   
5     5     0     1     0 A2   

So far I have tried:

df %>% 
  rowwise() %>% 
  mutate(New_Col = across(A1:A3, ~ case_when(. == 1 ~ paste0("colnames(.)", collapse = " "))))

Output (not what I want):

     ID    A1    A2    A3 New_Col$A1  $A2         $A3        
  <dbl> <dbl> <dbl> <dbl> <chr>       <chr>       <chr>      
1     1     0     1     1 NA          colnames(.) colnames(.)
2     2     0     1     1 NA          colnames(.) colnames(.)
3     3     1     1     1 colnames(.) colnames(.) colnames(.)
4     4     1     0     1 colnames(.) NA          colnames(.)
5     5     0     1     0 NA          colnames(.) NA   

What I want to learn:

  1. Is it possible to use across to check a condition across multiple columns?
  2. If so, what should the part after ~ in case_when look like to get the matching column names?
  3. How can I get a single column after using mutate, across and case_when, rather than three as here?

I thought I had already mastered this task, but somehow I have lost it...


Solution

  • To use across with case_when, you can do the following:

    library(dplyr)
    library(tidyr)
    
    df %>% 
      mutate(across(A1:A3, ~case_when(. == 1 ~ cur_column()), .names = 'new_{col}')) %>%
      unite(New_Col, starts_with('new'), na.rm = TRUE, sep = ' ')
    
    #    ID    A1    A2    A3 New_Col 
    #  <dbl> <dbl> <dbl> <dbl> <chr>   
    #1     1     0     1     1 A2 A3   
    #2     2     0     1     1 A2 A3   
    #3     3     1     1     1 A1 A2 A3
    #4     4     1     0     1 A1 A3   
    #5     5     0     1     0 A2      
    

    across creates three new columns, new_A1, new_A2 and new_A3, each holding the column name where the value is 1 and NA otherwise. unite then combines the three columns into a single New_Col, dropping the NA values because na.rm = TRUE.
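To make the two stages easier to follow, here is a small sketch that stops after the across step, so the intermediate helper columns can be inspected before unite collapses them. The object name step1 is only illustrative and does not appear in the answer above.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~ID, ~A1, ~A2, ~A3,
  1, 0, 1, 1,
  2, 0, 1, 1,
  3, 1, 1, 1,
  4, 1, 0, 1,
  5, 0, 1, 0
)

# Stage 1 only (step1 is a hypothetical name): one helper column per input
# column. cur_column() returns the name of the column currently being
# processed; rows that do not match the condition get case_when's default, NA.
step1 <- df %>%
  mutate(across(A1:A3, ~ case_when(. == 1 ~ cur_column()), .names = "new_{col}"))

# The original A1:A3 are kept; new_A1, new_A2 and new_A3 are appended.
# For example, new_A1 is NA, NA, "A1", "A1", NA.
print(step1)
```

Because each helper column holds either its own name or NA, the later unite call with na.rm = TRUE has nothing left to do but paste the non-missing names together.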


    Alternatively, use rowwise with c_across:

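    df %>% 
      rowwise() %>% 
      # `.` is the piped-in data frame, so names(.[-1]) is c("A1", "A2", "A3")
      mutate(New_Col = paste0(names(.[-1])[c_across(A1:A3) == 1], collapse = ' ')) %>%
      ungroup()  # drop the rowwise grouping afterwards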
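For comparison, the same row-by-row idea can be sketched in base R without rowwise. This is not part of the answer above, just an alternative under the assumption that the columns of interest are known by name; cols and hits are hypothetical helper names.

```r
# Same data as in the question, built as a plain data.frame so that the
# sketch needs no packages.
df <- data.frame(
  ID = 1:5,
  A1 = c(0, 0, 1, 1, 0),
  A2 = c(1, 1, 1, 0, 1),
  A3 = c(1, 1, 1, 1, 0)
)

cols <- c("A1", "A2", "A3")        # columns to check (assumed known up front)
hits <- as.matrix(df[cols]) == 1   # logical matrix: TRUE where the value is 1

# For each row, keep the names of the columns that are TRUE and paste them.
df$New_Col <- vapply(
  seq_len(nrow(hits)),
  function(i) paste(cols[hits[i, ]], collapse = " "),
  character(1)
)

print(df)
```

A row with no matching column would end up with an empty string here rather than NA, which may or may not be what is wanted.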