
Using forcats with data that has possible levels not present in data


My data has levels that are theoretically possible but not present in the data. I can easily represent this in base R:

factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))

If I tabulate it with table(), test3 shows a count of 0. This is useful because I can write functions that assume the levels cover all possible outcomes, in case data containing this level is added later.
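
For reference, the tabulation described above can be sketched in base R as follows. The variable names x and counts are illustrative only:

```r
# Declare all theoretically possible levels up front, even unobserved ones.
x <- factor(c("test", "test1", "test2"),
            levels = c("test", "test1", "test2", "test3"))

# table() reports every declared level, including those with no observations.
counts <- table(x)
print(counts)

# The unobserved level is present in the table with a count of zero.
counts[["test3"]]
```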

I cannot replicate this in forcats. First off, the as_factor function does not accept any additional arguments:

forcats::as_factor(c("test","test1","test2"), levels = c("test","test1","test2","test3"))

The above produces an error.

The following works, but with a warning (I would prefer to accomplish my goal without warnings, if possible):

forcats::as_factor(c("test","test1","test2")) %>% forcats::fct_recode(`test` = "test", `tests` = "test1", `tests` = "test2", `tests` = "test3")

Warning message:
Unknown levels in `f`: test3 

Is there any way in forcats to play with levels that theoretically exist but are not necessarily in the data at that moment?


Solution

  • To replicate the behavior of factor() with forcats, one option is fct_expand:

    library(magrittr)  # provides the %>% pipe
    c("test","test1","test2") %>%
        forcats::fct_expand(c("test","test1","test2","test3"))
    #[1] test  test1 test2
    #Levels: test test1 test2 test3
    

    As for passing other arguments through ... in as_factor: they are not actually used.

    library(forcats)
    methods(as_factor)
    #[1] as_factor.character* as_factor.factor*    as_factor.logical*   as_factor.numeric*  
    

    Now, we check the code of as_factor.character

    getAnywhere(as_factor.character)
    function (x, ...) 
    {
        structure(fct_inorder(x), label = attr(x, "label", exact = TRUE))
    }
    

    fct_inorder receives only x; none of the other arguments passed through ... reach it.

    So we can use fct_expand directly to expand the levels of a factor. A character vector is converted to a factor first.
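
Putting this together, a minimal sketch of the whole workflow might look like the following. It assumes a reasonably recent forcats is installed. The names possible, x, and recoded are illustrative and do not come from the question. Because test3 is already a level after fct_expand, the recode from the question should no longer have an unknown level to warn about.

```r
library(forcats)

# All theoretically possible levels, including the unobserved "test3".
possible <- c("test", "test1", "test2", "test3")

# Start from an ordinary factor and add any levels it is missing.
x <- fct_expand(factor(c("test", "test1", "test2")), possible)
levels(x)

# The unobserved level is tabulated with a zero count, as in base R.
table(x)

# Recode as in the question; "test3" is now a known level.
recoded <- fct_recode(x, tests = "test1", tests = "test2", tests = "test3")
levels(recoded)
```

Note that fct_expand appends new levels after the existing ones. If a specific level order matters, it may be simpler to build the factor with base factor(..., levels = ...) and pass the result to forcats functions, which operate on ordinary factors.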