I have a dataset with a categorical variable hospital_code, which has 10 levels.
The program I am running loops over the data and, on each pass, takes a subset in which the variable compLbl contains exactly 2 of the 10 hospital_code values, so that the two hospitals can be compared to each other. In each loop I now need compLbl to be binary coded (1s and 0s).
If I just take the subset from the first loop, in which the possible values for compLbl are AMH and BHC, I can easily do this as follows:
nData$compLbl2 = with(nData,(ifelse(compLbl == "AMH", 1,0)))
And get data that looks like this:
head(nData)
  compLbl outLbl Race_Code Age Complexity_Subclass_Code compLbl2
1     AMH      0         W  63                        1        1
2     AMH      0         W  44                        2        1
3     AMH      0         W  88                        3        1
4     BHC      0         W  64                        1        0
5     BHC      0         W  61                        2        0
6     BHC      0         W  61                        1        0
How can I generalize this so that it binary codes compLbl no matter which two values are present? My thought was to reference factor level 1 of whatever two values are present in the factor variable compLbl, like this:
nData$compLbl2 = with(nData,(ifelse(FACTORLEVEL(compLbl) == 1, 1,0)))
Where in my above example FACTORLEVEL(compLbl) would return a 1 for AMH and a 2 for BHC, since those are the factor levels that R would automatically assign. However, I'm not sure how to do this, or if it is possible.
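I would use this command, which drops the unused factor levels so that the two remaining ones are coded 1 and 2, and then maps the first level to 1 and the second to 0:
nData <- within(nData, compLbl2 <- 2 - as.numeric(compLbl[drop = TRUE]))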
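For anyone who wants to try the idea on something small, here is a minimal sketch on toy data. It assumes compLbl is a factor (or can be coerced to one) and that each subset really contains exactly two codes. The data frame dat and the helper binary_code are made-up names for illustration, and droplevels() is used here as an alternative to the [drop = TRUE] idiom.

```r
# Toy data standing in for the real dataset (hypothetical values).
dat <- data.frame(
  hospital_code = factor(c("AMH", "AMH", "BHC", "BJH", "BHC", "BJH")),
  Age = c(63, 44, 64, 61, 88, 61)
)

# Hypothetical helper: 1 for the first level present in x, 0 for the other.
binary_code <- function(x) {
  x <- droplevels(as.factor(x))    # discard levels absent from this subset
  stopifnot(nlevels(x) == 2)       # guard: exactly two codes are expected
  as.integer(as.integer(x) == 1L)  # first remaining level -> 1, second -> 0
}

# Subset to one pair, as a single loop iteration would, then code it.
nData <- subset(dat, hospital_code %in% c("BHC", "BJH"))
nData$compLbl <- nData$hospital_code
nData$compLbl2 <- binary_code(nData$compLbl)
```

Dropping the unused levels first is the important step: a subset of a factor keeps all of the original levels, so without it the integer codes of the two hospitals would not necessarily be 1 and 2.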