This question is related to Convert factor to integer and How to convert a factor to an integer\numeric without a loss of information but has a slightly different problem with type coercion.
The two former questions seem to deal with cases where a factor is explicitly constructed from a previously existing vector of class numeric or of class integer without relabeling the levels. In these cases:
f <- factor(c("1","2","1","2"))
as.numeric(levels(f))[f]
returns
# [1] 1 2 1 2
but when I relabel the levels:
f <- factor(c("1","2","1","2"))
f <- factor(f,
levels = c(1, 2),
labels = c("a", "b"))
as.numeric(levels(f))[f]
I will get
# [1] NA NA NA NA
# Warning message:
# NAs introduced by coercion
whereas
as.numeric(f)
returns
# [1] 1 2 1 2
What is the right procedure in such a case to get the original values back? Is it just as.numeric(f)?
In case it's relevant:
> sessionInfo()
R version 3.1.2 RC (2014-10-28 r66890)
Platform: x86_64-unknown-linux-gnu (64-bit)
locale:
[1] LC_CTYPE=en_IE.UTF-8 LC_NUMERIC=C
[3] LC_TIME=en_IE.UTF-8 LC_COLLATE=en_IE.UTF-8
[5] LC_MONETARY=en_IE.UTF-8 LC_MESSAGES=en_IE.UTF-8
[7] LC_PAPER=en_IE.UTF-8 LC_NAME=C
[9] LC_ADDRESS=C LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_IE.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] tools_3.1.2
If you know for certain that there is an exact correspondence between the original levels and the underlying factor/integer encoding, then you can use as.numeric(f). But ... if the original vector were
f <- factor(c("2","3","2","3"))
and you changed the level labels to alphabetic values, then as.numeric(f) would give misleading results: it would return 1 2 1 2 rather than 2 3 2 3, because the integer codes of a factor always start at 1L.
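One way to guard against this, sketched below, is to keep the original levels in a separate vector before relabeling and to index that vector with the factor's integer codes. The names orig_levels, g, codes and recovered are introduced here purely for illustration; the sketch assumes the original levels are still available at the point where the values are needed.

```r
# Assumption: the original levels are saved before the factor is relabeled.
orig_levels <- c("2", "3")

f <- factor(c("2", "3", "2", "3"), levels = orig_levels)

# Relabel the levels; after this the factor itself only knows "a" and "b".
g <- factor(f, levels = orig_levels, labels = c("a", "b"))

# Integer codes always run from 1 to nlevels(g), whatever the original values were.
codes <- as.numeric(g)

# Index the saved levels by the integer codes to get the original values back.
recovered <- as.numeric(orig_levels)[as.integer(g)]

print(codes)      # [1] 1 2 1 2
print(recovered)  # [1] 2 3 2 3
```

If the original levels were not saved, the relabeled factor alone does not contain enough information to recover them, so as.numeric(g) can only be trusted when the original values happened to be 1, 2, ..., n in level order.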