
Hmisc package changes original codes from 0:1 to 1:2


I use Hmisc to assign factor labels and variable names, and it is very handy. But I ran into a problem. Here is the code:

a <- c(1,0,1,0,1,0,1,0,1,0)
b <- c("a","b","a","b","a","b","a","b","a","b")
df.new <- data.frame(a,b)
library(Hmisc)
df.new.1 <- upData(df.new, lowernames=TRUE,
                   a=factor(a, labels=c("No","Yes")),
                   b=factor(b, labels=c("No","Yes")))

For the character vector, str gives the following coding and labels:

str(df.new.1$b)

 Factor w/ 2 levels "No","Yes": 1 2 1 2 1 2 1 2 1 2

which is fine.

But when I check the coding and labels of the numeric variable with str, it gives:

str(df.new.1$a)

 Factor w/ 2 levels "No","Yes": 2 1 2 1 2 1 2 1 2 1

which is weird! The original 0/1 coding is gone. How can I fix this? I would like to keep my original 0/1 variable for later regression purposes. Thanks.


Solution

  • As juba's answer explains, this is how factors are expected to work: a factor stores 1-based integer codes that index its levels, so the values 0 and 1 become the codes 1 and 2. However, if you really want both descriptive factor labels and the original numeric values, you can add the values as an attribute of the factor, e.g.,

    > a <- c(1,0,1,0,1,0,1,0,1,0)
    > tmp <- a
    > a <- factor(a, labels=c("No","Yes"))
    > attr(a, "values") <- tmp
    > a
     [1] Yes No  Yes No  Yes No  Yes No  Yes No 
    attr(,"values")
     [1] 1 0 1 0 1 0 1 0 1 0
    Levels: No Yes
    > str(a)
     Factor w/ 2 levels "No","Yes": 2 1 2 1 2 1 2 1 2 1
     - attr(*, "values")= num [1:10] 1 0 1 0 1 0 1 0 1 0
    > attributes(a)$values
     [1] 1 0 1 0 1 0 1 0 1 0
    >
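
Because the internal codes are just 1-based positions in the levels, the original 0/1 values can also be recovered from the factor itself, without storing an extra attribute. A minimal sketch, assuming the labels were assigned in the order 0 = "No", 1 = "Yes" as above; `a_f`, `a_back`, `a_lookup`, and `original_values` are illustrative names, not part of Hmisc:

```r
# Original 0/1 variable, as in the question
a <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)

# Spell out the levels so the mapping is explicit:
# 0 -> "No" (internal code 1), 1 -> "Yes" (internal code 2)
a_f <- factor(a, levels = c(0, 1), labels = c("No", "Yes"))

# Internal codes are 1-based positions into levels(a_f);
# subtracting 1 recovers the original 0/1 coding
a_back <- as.integer(a_f) - 1L

# Alternative: look the values up by label
# (original_values is a hypothetical lookup vector)
original_values <- c(No = 0, Yes = 1)
a_lookup <- unname(original_values[as.character(a_f)])
```

The subtraction only works when the original values are exactly 0 and 1 in level order; the lookup vector is the more general option. For regression, R's default treatment contrasts already code a two-level factor predictor as a 0/1 dummy, so in many cases the factor can probably be passed to lm() or glm() directly.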