
Unexpected behavior when extracting factor levels


Can someone explain why levels() shows three factor levels, while you can see that the vector has only two?

> str(walk.df)
'data.frame':   10 obs. of  4 variables:
 $ walker : Factor w/ 3 levels "1","2","3": 1 1 1 1 1 2 2 2 2 2

> walk.df$walker
 [1] 1 1 1 1 1 2 2 2 2 2
Levels: 1 2 3

I would like to extract a vector of the levels, and I thought this was the proper way, but as you can see, a 3 sneaks in there, which is messing up my function.

> as.numeric(levels(walk.df$walker))
[1] 1 2 3

Solution

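  • walk.df is probably a subset of a data frame in which the walker factor had 3 levels. Subsetting a factor keeps all of its original levels, even ones that no longer occur. For example: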

    a <- factor(1:3)
    b <- a[1:2]
    

    Then b still has 3 levels, even though it contains only two values.

    An easy way to drop the unused level is:

    b <- a[1:2, drop = TRUE]
    

    Or, if you cannot access the original variable, rebuild the factor from its current values:

    b <- factor(b)
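
To tie this back to the question, here is a minimal sketch. It assumes walk.df was produced by filtering a larger data frame; the construction below is hypothetical, since the original data is not shown. It uses droplevels() from base R, which drops unused levels in much the same way as factor(b) and also accepts a whole data frame.

```r
# Hypothetical reconstruction of the asker's data: a walker factor with
# levels 1, 2 and 3, from which the rows for walker 3 were filtered out.
full.df <- data.frame(walker = factor(c(1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 3)))
walk.df <- full.df[full.df$walker != "3", , drop = FALSE]

# The unused level survives the subsetting.
before <- levels(walk.df$walker)

# Drop levels that no longer occur, then extract the levels as numbers.
walk.df$walker <- droplevels(walk.df$walker)
used <- as.numeric(levels(walk.df$walker))

print(before)  # "1" "2" "3"
print(used)    # 1 2
```

If the factor itself should stay untouched, as.numeric(as.character(unique(walk.df$walker))) gives the values that actually occur without changing the levels.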