Why does is.factor() return different values when used in apply() and sapply()?


My objective is to get a vector of boolean values indicating whether each column in a data.frame is a factor. I used is.factor() inside both sapply() and apply(), and they return different values; the result from apply() is wrong. Can someone tell me what causes the difference?

X <- data.frame(X1=c(1,2,3,4), ## numeric
                X2=factor(paste0("f",c(1:4)))) ## factor

sapply(X, is.factor)
## FALSE  TRUE 

apply(X, 2, is.factor)
## FALSE FALSE   (unexpected: the second value should be TRUE)

The same thing happens with other functions such as class() and is.numeric().


Solution

  • From the documentation of apply():

    Returns a vector or array or list of values obtained by applying a function to margins of an array or matrix.

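    Therefore, apply() first converts your input object to a matrix (array), and all elements of a matrix must share a single atomic data type. Because X2 is not numeric, the whole data frame is coerced to character: the factor is replaced by its labels and loses its class. Every column that is.factor() receives is then a plain character vector, so it returns FALSE for both. sapply(), in contrast, iterates over the data frame as a list of columns, so each column keeps its original class.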

    > as.matrix(X)
         X1  X2  
    [1,] "1" "f1"
    [2,] "2" "f2"
    [3,] "3" "f3"
    [4,] "4" "f4"
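
To see the effect directly, here is a minimal sketch that rebuilds the data frame from the question and compares what each function sees. The variable name is_fac is only illustrative, and vapply() is offered as one possible stricter alternative to sapply(), not as the only correct approach.

```r
# Same data frame as in the question: one numeric column, one factor column.
X <- data.frame(X1 = c(1, 2, 3, 4),
                X2 = factor(paste0("f", 1:4)))

# apply() works on as.matrix(X), so every column arrives as a character vector.
col_classes <- apply(X, 2, class)
print(col_classes)

# sapply() iterates over the data frame as a list of columns,
# so each column keeps its original class.
print(sapply(X, is.factor))

# vapply() does the same but also checks that each result
# is a single logical value.
is_fac <- vapply(X, is.factor, logical(1))
print(is_fac)
```

For column-wise checks on a data frame, sapply(), vapply(), or lapply() are generally the safer choice; apply() is better suited to objects that are already matrices or arrays.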