Convert factor into logical datatype


I have a two-level factor in my data that I want to convert to logical:

str(df$y)
Factor w/ 2 levels "0","1": 1 1 1 1 1 1 1 1 1 1 ...

I use a <- as.logical(df$y) to convert it to logical. However, every value turns into NA:

summary(a)

      Mode    NA's 
    logical  500000

At which point do I fail to convert the data?


Solution

  • At which point do I fail to convert the data?

    I'd argue that at no point do you fail to convert the data; it's the function that is a bit odd and doesn't treat your data the way you'd expect.

    If you read ?as.logical you'll see that when the input is a factor, its levels (which are character strings) are used in the conversion, not the underlying integer codes. The only character strings recognised are "T", "TRUE", "true" and "True" for TRUE, and "F", "FALSE", "false" and "False" for FALSE; everything else, including "0" and "1", returns NA. However, 0 and 1 are interpreted as FALSE and TRUE, respectively, when they are given as numeric, hence all of the following work:

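    y <- factor(c(0, 1, 1, 0))
    
    # index the levels by the factor, then convert the labels "0"/"1" to integer
    as.logical(as.integer(levels(y)[y]))
    # the integer codes are 1 and 2 here, so subtracting 1 gives 0 and 1
    as.logical(as.integer(y) - 1L)
    # go through character to recover the labels, then to integer
    as.logical(as.integer(as.character(y)))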
    

    A bit cumbersome, I know, but that's how it is.
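
    To make the behaviour described above concrete, here is a minimal sketch using the same toy factor `y`. The last step, comparing the factor against the label "1", is an alternative that is not part of the original answer, and `y_lgl` is just an illustrative name; it assumes "1" is the level that should map to TRUE.

    ```r
    # Same toy factor as in the answer
    y <- factor(c(0, 1, 1, 0))

    # Character input: only strings like "TRUE"/"true"/"T" and
    # "FALSE"/"false"/"F" are recognised; "0" and "1" are not
    as.logical(c("TRUE", "false", "T", "0", "1"))
    # TRUE FALSE TRUE NA NA

    # A factor is converted via its labels, so every element becomes NA here
    as.logical(y)
    # NA NA NA NA

    # Numeric input: 0 is FALSE, 1 is TRUE
    as.logical(c(0, 1))
    # FALSE TRUE

    # Alternative (assumption: "1" should mean TRUE): compare against the label
    y_lgl <- y == "1"
    y_lgl
    # FALSE TRUE TRUE FALSE
    ```

    The comparison avoids the round trip through integer and does not depend on the order of the levels, which `as.integer(y) - 1L` does.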