My data has levels that are theoretically possible but not present in the data. I can easily represent this in base R:
factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))
If I tabulate it with table(), the count for test3 is 0. This is great: I can write functions that assume the levels cover all possible outcomes, in case data containing this level is added later.
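A minimal sketch of the base R behavior described above, using only factor(), table() and nlevels(); the variable names f and counts are chosen here purely for illustration.

```r
# Declare every theoretically possible level, including one that is absent.
f <- factor(c("test", "test1", "test2"),
            levels = c("test", "test1", "test2", "test3"))

# table() reports every declared level, so the unused one appears with count 0.
counts <- table(f)
print(counts)

# nlevels() counts declared levels, not the values actually observed.
nlevels(f)
```

Code that iterates over levels(f) or names(counts) will therefore already handle test3 once it starts appearing in the data.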
I cannot replicate this in forcats. First off, the as_factor function does not accept any additional arguments:
forcats::as_factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))
The above produces an error.
The following works, but with a warning, and I would prefer to accomplish my goal without warnings if possible:
forcats::as_factor(c("test","test1","test2")) %>% forcats::fct_recode(`test` = "test", `tests` = "test1", `tests` = "test2", `tests` = "test3")
Warning message:
Unknown levels in `f`: test3
Is there any way in forcats to work with levels that theoretically exist but are not necessarily in the data at that moment?
To replicate the behavior of factor with forcats, one option is fct_expand:
c("test","test1","test2") %>%
forcats::fct_expand(c("test","test1","test2","test3"))
#[1] test test1 test2
#Levels: test test1 test2 test3
Regarding the ... argument of as_factor (for passing other arguments): it is not actually used. The available methods are:
library(forcats)
methods(as_factor)
#[1] as_factor.character* as_factor.factor* as_factor.logical* as_factor.numeric*
Now, we check the code of as_factor.character:
getAnywhere(as_factor.character)
function (x, ...)
{
structure(fct_inorder(x), label = attr(x, "label", exact = TRUE))
}
fct_inorder receives only x; nothing passed in through ... is forwarded to it, so a levels argument cannot take effect.
Here, we can use fct_expand directly to expand the levels of a factor or of a character vector (which is converted to a factor first).
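As a rough sketch of how this could address the warning in the question, the placeholder level can be added with fct_expand before recoding. The variable names x, f and g are illustrative only, and the expectation that fct_recode stays silent rests on the assumption that it only warns about levels missing from the factor.

```r
library(forcats)

x <- c("test", "test1", "test2")

# Add the placeholder level first so that "test3" is a declared level.
f <- fct_expand(x, "test3")

# Collapse the three variants into one level; since "test3" now exists
# as a level, the "Unknown levels" warning should not be triggered.
g <- fct_recode(f, tests = "test1", tests = "test2", tests = "test3")

levels(f)
levels(g)
table(f)
```

Note that after the recode the placeholder is merged into tests, so the zero count for test3 is only visible on f, not on g.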