r dplyr categorical-data dummy-variable

Transform dummy variable into categorical variable


Here is my data frame:

data<-data.frame(
ID=c(1:8),
Diag1=c(1,0,1,0,1,0,1,0),
Diag2=c(0,1,0,1,0,0,1,0),
Diag3=c(0,0,0,1,0,1,1,0),
Multiple.Diag=c(0,0,1,1,0,0,1,0)
)

I have patients with different diagnoses, and some of them have multiple diagnoses. The diagnoses are stored as dummy variables that need to be converted into a single categorical variable. If a patient has Multiple.Diag == 1, the diagnosis should be Multiple.Diag; otherwise it should be whichever of Diag1, Diag2 or Diag3 is set. If a patient has 0 for all of these variables, the diagnosis should be "Other".

Here is what I want to have:

  ID     Diagnosis
1  1         Diag1
2  2         Diag2
3  3 Multiple.Diag
4  4 Multiple.Diag
5  5         Diag1
6  6         Diag3
7  7 Multiple.Diag
8  8         Other

Solution

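  • With the tidyverse you could reshape to long format and keep, for each ID, the flagged column whose name sorts last (Multiple.Diag sorts after Diag1, Diag2 and Diag3):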

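    library(dplyr)
    library(tidyr)

    data %>% 
      pivot_longer(-ID) %>%
      group_by(ID) %>%
      # factor levels sort alphabetically, so Multiple.Diag gets the highest code;
      # multiplying by value zeroes out unset flags before which.max picks a row
      slice(which.max(as.integer(factor(name)) * value)) %>%
      # a patient with all zeros still has value == 0 here and becomes "Other"
      mutate(name = if_else(value == 0, 'Other', name), value = NULL)
    # A tibble: 8 x 2
    # Groups:   ID [8]
         ID name         
      <int> <chr>        
    1     1 Diag1        
    2     2 Diag2        
    3     3 Multiple.Diag
    4     4 Multiple.Diag
    5     5 Diag1        
    6     6 Diag3        
    7     7 Multiple.Diag
    8     8 Other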
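
  • The rule from the question can also be spelled out condition by condition, which may be easier to read than the reshaping approach. The following is only a sketch using dplyr's case_when. It assumes that a patient without Multiple.Diag has at most one of Diag1, Diag2 or Diag3 set; if several were set, the first matching condition would win. The names result and Diagnosis are chosen here for illustration.

```r
library(dplyr)

# Sample data from the question
data <- data.frame(
  ID = 1:8,
  Diag1 = c(1, 0, 1, 0, 1, 0, 1, 0),
  Diag2 = c(0, 1, 0, 1, 0, 0, 1, 0),
  Diag3 = c(0, 0, 0, 1, 0, 1, 1, 0),
  Multiple.Diag = c(0, 0, 1, 1, 0, 0, 1, 0)
)

result <- data %>%
  mutate(
    # conditions are checked top to bottom; the first TRUE one is used
    Diagnosis = case_when(
      Multiple.Diag == 1 ~ "Multiple.Diag",
      Diag1 == 1 ~ "Diag1",
      Diag2 == 1 ~ "Diag2",
      Diag3 == 1 ~ "Diag3",
      TRUE ~ "Other"  # fallback when every flag is 0
    )
  ) %>%
  select(ID, Diagnosis)

print(result)
```

    Because Multiple.Diag is tested first, this version does not depend on the alphabetical order of the column names.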