
Recoding dummy variables to an ordered factor


I need some help with coding factors for a logistic regression.

What I have are six dummy variables representing income brackets. I want to convert these into a single ordered factor for use in a logistic regression.

My data frame looks like:

    INC1 INC2 INC3 INC4 INC5 INC6
1      0    0    1    0    0    0  
2     NA   NA   NA   NA   NA   NA  
3      0    0    0    0    0    1  
4      0    0    0    0    0    1  
5      0    0    1    0    0    0  
6      0    0    0    1    0    0  
7      0    0    1    0    0    0  
8      0    0    0    1    0    0

What I want it to look like:

    INC
1   INC3  
2   NA   
3   INC6  
4   INC6  
5   INC3 
6   INC4  
7   INC3  
8   INC4   

This must be a common (and simple) operation, but my searches have not turned up a concise answer for how to perform this re-coding. Any help is very much appreciated.


Solution

  • Here's a solution based on another answer that keeps the NA values and converts to an ordered factor.

    > inc
      INC1 INC2 INC3 INC4 INC5 INC6
    1    0    0    1    0    0    0
    2   NA   NA   NA   NA   NA   NA
    3    0    0    0    0    0    1
    4    0    0    0    0    0    1
    5    0    0    1    0    0    0
    6    0    0    0    1    0    0
    7    0    0    1    0    0    0
    8    0    0    0    1    0    0
    > inc$F = factor(apply(inc, 1, function(x) names(x)[x == 1]),levels=names(inc),ordered=TRUE)
    
    > inc
      INC1 INC2 INC3 INC4 INC5 INC6    F
    1    0    0    1    0    0    0 INC3
    2   NA   NA   NA   NA   NA   NA <NA>
    3    0    0    0    0    0    1 INC6
    4    0    0    0    0    0    1 INC6
    5    0    0    1    0    0    0 INC3
    6    0    0    0    1    0    0 INC4
    7    0    0    1    0    0    0 INC3
    8    0    0    0    1    0    0 INC4
    > inc$F
    [1] INC3 <NA> INC6 INC6 INC3 INC4 INC3 INC4
    Levels: INC1 < INC2 < INC3 < INC4 < INC5 < INC6
    

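    Note that this assumes each row contains at most one 1. A row with several 1s yields more than one column name, so it cannot be mapped to a single level.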
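
    If rows with zero or several 1s are possible, the row-wise lookup can be made explicit about them. The following is only a sketch, assuming every column of `inc` is an income dummy and that ambiguous rows should become NA; `dummy_cols` and `pick_level` are hypothetical names introduced for illustration, not part of the original answer.

```r
# Rebuild the example data from the question.
inc <- data.frame(
  INC1 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC2 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC3 = c(1, NA, 0, 0, 1, 0, 1, 0),
  INC4 = c(0, NA, 0, 0, 0, 1, 0, 1),
  INC5 = c(0, NA, 0, 0, 0, 0, 0, 0),
  INC6 = c(0, NA, 1, 1, 0, 0, 0, 0)
)

# Assumption: all columns are dummies, listed from lowest to highest bracket.
dummy_cols <- names(inc)

# Return the name of the single column equal to 1.
# Rows with no 1 (all NA or all 0) or with more than one 1 give NA.
pick_level <- function(x) {
  hit <- which(x == 1)  # which() ignores NA comparisons
  if (length(hit) == 1) names(x)[hit] else NA_character_
}

# pick_level always returns one value, so apply() yields a character vector.
labels <- unname(apply(inc[dummy_cols], 1, pick_level))

inc$F <- factor(labels, levels = dummy_cols, ordered = TRUE)
```

    Because `pick_level` always returns exactly one value per row, `apply()` returns a plain character vector rather than a list. When the factor is used in a model formula, keep in mind that R applies polynomial contrasts (`contr.poly`) to ordered factors by default.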