
Why does `as.factor` not work when applied via the `apply` function in R?


Why does the as.factor function not work when applied via the apply function in R?

> df.nrow <- 10
> df <- data.frame(col1=sample(c("a","b","c"), df.nrow, TRUE),
+                  col2=sample(c("d","e","f"), df.nrow, TRUE),
+                  col3=sample(c("g","h","i"), df.nrow, TRUE))
> apply(df, 2, is.factor)
 col1  col2  col3 
FALSE FALSE FALSE 
> df <- apply(df, 2, as.factor)
> apply(df, 2, is.factor)
 col1  col2  col3 
FALSE FALSE FALSE 

Solution

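  • I think this is because of how apply handles a data frame: it coerces it to a matrix before calling the function and simplifies the result back into a matrix, so the factors are lost along the way. From ?apply: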

    If ‘X’ is not an array but an object of a class with a non-null
        ‘dim’ value (such as a data frame), ‘apply’ attempts to coerce it
        to an array via ‘as.matrix’ if it is two-dimensional (e.g., a data
        frame) or via ‘as.array’.
    

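    Whether your original data frame already holds factors depends on your R version. Try str(df) or sapply(df, is.factor) to check; apply(df, 2, is.factor) reports FALSE either way because of the matrix coercion. Before R 4.0.0, data.frame() converted character vectors to factors unless stringsAsFactors=FALSE. Since R 4.0.0 the default is stringsAsFactors=FALSE, so the columns stay character unless you convert them.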
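
If the columns do need converting, a common alternative is to loop over them with lapply instead of apply, since lapply works on the data frame's columns directly and never goes through as.matrix. The following is a minimal sketch under that assumption; the data mirrors the question, and set.seed and the explicit stringsAsFactors=FALSE are added here only so the example behaves the same on any R version.

```r
# Example data mirroring the question; stringsAsFactors = FALSE is set
# explicitly (an assumption of this sketch) so the columns start as character.
set.seed(1)
df <- data.frame(col1 = sample(c("a", "b", "c"), 10, TRUE),
                 col2 = sample(c("d", "e", "f"), 10, TRUE),
                 col3 = sample(c("g", "h", "i"), 10, TRUE),
                 stringsAsFactors = FALSE)

# apply() first calls as.matrix(df), so as.factor only ever sees slices of a
# character matrix, and the combined result is again a character matrix.
m <- apply(df, 2, as.factor)
is.matrix(m)
is.character(m)

# lapply() iterates over the data frame's columns themselves; assigning into
# df[] replaces the columns while keeping the data.frame structure.
df[] <- lapply(df, as.factor)
sapply(df, is.factor)
```

Assigning to df[] rather than df matters here: lapply returns a plain list, and the empty brackets write that list back into the existing data frame instead of replacing it.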