How do I get the factor level of a variable for each observation, and use those values to create a new variable in R?


I have a dataset with a categorical variable hospital_code which has 10 levels.

The program I am running loops over the data and, on each pass, takes a subset in which the variable compLbl contains exactly 2 of the 10 hospital_codes, so that the two can be compared to each other. In each loop, I now need compLbl to be binary coded (1s and 0s).

If I just take the subset data from the first loop, in which the possible values for compLbl are AMH and BHC, I can easily do this as follows:

nData$compLbl2 = with(nData,(ifelse(compLbl == "AMH", 1,0)))

And get data that looks like this:

head(nData)
  compLbl outLbl Race_Code Age Complexity_Subclass_Code compLbl2
1     AMH      0         W  63                        1        1
2     AMH      0         W  44                        2        1
3     AMH      0         W  88                        3        1
4     BHC      0         W  64                        1        0
5     BHC      0         W  61                        2        0
6     BHC      0         W  61                        1        0

How can I generalize this so that no matter what two values are in compLbl it will binary code them? My thought was to possibly do this by referencing factor level 1 for whatever two values are present in the factor variable compLbl. Like this:

nData$compLbl2 = with(nData,(ifelse(FACTORLEVEL(compLbl) == 1, 1,0)))

Where in my above example FACTORLEVEL(compLbl) would return a 1 for AMH and a 2 for BHC since those are the factor levels that R would automatically assign. However, I'm not sure how to do this, or if it is possible.


Solution

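  • I would use this command, which assumes compLbl is a factor:

    nData <- within(nData, compLbl2 <- 2 - as.numeric(compLbl[drop = TRUE]))

    Here compLbl[drop = TRUE] drops the unused levels left over from the full data set, as.numeric() returns each row's level index (1 or 2), and subtracting that index from 2 codes the first level as 1 and the second as 0. Note that within() needs an assignment with <- inside the expression, and that rev() would reverse the row order rather than swap the 0s and 1s.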
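A minimal sketch of the same idea on toy data, for anyone who wants to check the behaviour before putting it in the loop. The hospital codes and row values below are made up for illustration, and droplevels() is used here in place of the [drop = TRUE] idiom; both remove the levels that no longer occur in the subset.

```r
# Toy stand-in for the full data: a factor with more levels than the
# subset will contain (values are invented for illustration).
hospital_code <- factor(c("AMH", "AMH", "BHC", "BJH", "BHC", "AMH"))
nData <- data.frame(compLbl = hospital_code)

# One loop iteration: keep only the two hospitals being compared.
nData <- nData[nData$compLbl %in% c("AMH", "BHC"), , drop = FALSE]

# Drop the levels that no longer occur, so exactly two levels remain.
lbl <- droplevels(nData$compLbl)
stopifnot(nlevels(lbl) == 2)

# as.integer() on a factor returns the level index (1 or 2);
# comparing with 1 flags the first level, whatever its label is.
nData$compLbl2 <- as.integer(as.integer(lbl) == 1L)

print(nData)
```

Because the comparison is on the level index rather than on a label such as "AMH", the same lines should work unchanged for any pair of hospitals. Which hospital gets the 1 depends on the level order, which is alphabetical by default.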