I use Hmisc to assign factor labels and variable names, and it is very handy. But I ran into a problem. Here is the code:
a <- c(1,0,1,0,1,0,1,0,1,0)
b <- c("a","b","a","b","a","b","a","b","a","b")
df.new <- data.frame(a,b)
library(Hmisc)
df.new.1 <- upData(df.new,lowernames=TRUE,a=factor(a,labels=c("No","Yes")),b=factor(b,labels=c("No","Yes")))
For the character vector b, str gives the following codes and labels:
str(df.new.1$b)
Factor w/ 2 levels "No","Yes": 1 2 1 2 1 2 1 2 1 2
which is fine.
But for the numeric vector a, str gives:
str(df.new.1$a)
Factor w/ 2 levels "No","Yes": 2 1 2 1 2 1 2 1 2 1
which is weird! The original 0/1 coding is gone. How can I fix this? I would like to keep my original 0/1 variable for later regression purposes. Thanks.
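As juba's answer explains, this is the expected way for factors to work: a factor stores integer codes starting at 1 that index its levels, so the sorted values 0 and 1 become codes 1 and 2. However, if you really want both descriptive factor labels and the original numeric values, you can add the values as an attribute of the factor, e.g.,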
> a <- c(1,0,1,0,1,0,1,0,1,0)
> tmp <- a
> a <- factor(a, labels=c("No","Yes"))
> attr(a, "values") <- tmp
> a
[1] Yes No Yes No Yes No Yes No Yes No
attr(,"values")
[1] 1 0 1 0 1 0 1 0 1 0
Levels: No Yes
> str(a)
Factor w/ 2 levels "No","Yes": 2 1 2 1 2 1 2 1 2 1
- attr(*, "values")= num [1:10] 1 0 1 0 1 0 1 0 1 0
> attributes(a)$values
[1] 1 0 1 0 1 0 1 0 1 0
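If you would rather not carry an extra attribute around, here is a minimal sketch of two alternatives. It assumes the labels are assigned in the order of the sorted values (0 then 1), as in the question. The names a_f, a_back and df.keep are made up for illustration and do not come from Hmisc.

```r
a <- c(1, 0, 1, 0, 1, 0, 1, 0, 1, 0)

# Spell out the level order so the label mapping is explicit: 0 -> "No", 1 -> "Yes"
a_f <- factor(a, levels = c(0, 1), labels = c("No", "Yes"))

# The internal codes are 1-based positions in levels(a_f),
# so subtracting 1 recovers the original 0/1 coding
a_back <- as.integer(a_f) - 1L

# Alternative: keep the numeric column next to the labelled factor
df.keep <- data.frame(a = a, a_f = a_f)
str(df.keep)
```

The subtraction only works because the original values happen to be 0 and 1 in level order. For other codings, keeping the numeric column alongside the factor is probably the safer choice.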