R dataframe factors


I want to drop the levels from a data frame (please do not mark this question as a duplicate :)). Of all the methods I tried, only one works. What am I doing wrong? Example:

> df = data.frame(x = c("a","b","c"), y = c("d","e","f"), stringsAsFactors = TRUE)
> class(df$x)
[1] "factor"
> levels(df$x)
[1] "a" "b" "c"

Method 1 not working:

> df1 = droplevels(df)
> class(df1$x)
[1] "factor"
> levels(df1$x)
[1] "a" "b" "c"

Method 2 not working:

> df2 = as.data.frame(df, stringsAsFactors = FALSE) 
> class(df2$x)
[1] "factor"
> levels(df2$x)
[1] "a" "b" "c"

Method 3 not working:

> df3 = df
> df3$x = factor(df3$x) 
> class(df3$x)
[1] "factor"
> levels(df3$x)
[1] "a" "b" "c"

Method 4 finally works:

> df4 = df
> df4$x = as.vector(df4$x)
> class(df4$x)
[1] "character"
> levels(df4$x)
NULL

Although it works, I think method 4 is the least elegant. Can you help me debug this? Many thanks.

EDIT: Following the comments and answers: I want to remove the factor structure from a data frame entirely, not just drop unused levels.


Solution

  • "Dropping levels" refers to getting rid of unused factor levels, but keeping the object as class factor. You're looking for a way to convert all factor columns into character columns:

    > df2 = data.frame(lapply(df,
                              function(x) if (is.factor(x)) as.character(x) else x),
                       stringsAsFactors = FALSE)
    > lapply(df2, class)
    $x
    [1] "character"
    
    $y
    [1] "character"
    
    > df2
      x y
    1 a d
    2 b e
    3 c f
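
As for why methods 1 to 3 leave the column unchanged: droplevels() only removes levels that no longer occur, and here all three are in use; as.data.frame() on something that is already a data frame does not convert existing factor columns back to character; and factor() applied to a factor simply builds a factor again. The sketch below illustrates the difference and shows an in-place variant of the conversion. The names sub, df_chr and is_fct are made up for this example, and stringsAsFactors = TRUE is set explicitly because the default in data.frame() has been FALSE since R 4.0.0.

```r
# Build the example with factor columns explicitly.
df <- data.frame(x = c("a", "b", "c"), y = c("d", "e", "f"),
                 stringsAsFactors = TRUE)

# droplevels() only discards levels that no longer occur in the data.
sub <- df[df$x != "c", ]
levels(sub$x)              # "c" is still listed as a level
levels(droplevels(sub)$x)  # "c" is gone, but the column is still a factor

# Convert only the factor columns to character, in place.
# Assigning into the selected columns keeps the data frame itself intact.
df_chr <- df
is_fct <- vapply(df_chr, is.factor, logical(1))
df_chr[is_fct] <- lapply(df_chr[is_fct], as.character)

# Inspect the resulting column classes.
vapply(df_chr, class, character(1))
```

Compared with rebuilding the data frame via data.frame(lapply(...)), this variant touches only the factor columns, so it may be preferable when the data frame carries row names or non-factor columns you want to leave exactly as they are.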