
Why does R change the variable type when prepending NA values to a data frame with factors?


I have a problem with the way R coerces variable types when I use rbind on two data.frames, one of which contains only NA values. To illustrate with an example:

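x <- factor(sample(1:3, 10, TRUE))  # factor column
y <- rnorm(10)                      # numeric column
dat <- data.frame(x, y)
# Block of NAs with the same shape and column names as dat
NAs <- data.frame(matrix(NA, ncol = ncol(dat), nrow = nrow(dat)))
colnames(NAs) <- colnames(dat)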

The goal is to combine dat and NAs while keeping x a factor and y numeric. When I run:

dat_forward <- rbind(dat, NAs)
is.factor(dat_forward$x)

this works fine. However, binding in the other order with rbind fails:

dat_backward <- rbind(NAs, dat)
is.factor(dat_backward$x)
is.character(dat_backward$x)

Now x is coerced to character. I am confused: can't it stay a factor when I bind in the other order? What would be a straightforward change to my code to reach my goal?


Solution

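  • rbind takes the column classes from the first data.frame it is given, so a fairly simple way to get the column classes right is to put one row of dat first and drop it afterwards: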

    x <- rbind(dat[1,], NAs, dat)[-1,]
    str(x)
    #  $ x: Factor w/ 3 levels "1","2","3": NA NA NA NA NA NA NA NA NA NA ...
    #  $ y: num  NA NA NA NA NA NA NA NA NA NA ...
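
To see why the order matters, it may help to compare the column classes directly. The following is a minimal sketch, assuming the same setup as in the question; the `set.seed` call and the `*_classes` names are only there for illustration and are not part of the original code.

```r
# Hypothetical data mirroring the question's setup (seed added for repeatability)
set.seed(1)
dat <- data.frame(x = factor(sample(1:3, 10, TRUE)), y = rnorm(10))
NAs <- data.frame(matrix(NA, ncol = ncol(dat), nrow = nrow(dat)))
colnames(NAs) <- colnames(dat)

# A matrix filled with NA is logical, so both columns of NAs start out logical
na_classes <- sapply(NAs, class)

# Column classes for each binding order
forward_classes  <- sapply(rbind(dat, NAs), class)                  # dat first
backward_classes <- sapply(rbind(NAs, dat), class)                  # NAs first
fixed_classes    <- sapply(rbind(dat[1, ], NAs, dat)[-1, ], class)  # dat row first, then dropped

# One row per binding order, one column per variable
print(rbind(NAs = na_classes, forward = forward_classes,
            backward = backward_classes, fixed = fixed_classes))
```

In the `backward` row, x is no longer a factor, because the first argument's logical column sets the starting class. In the `fixed` row, the leading row of dat restores it.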
    

    More generally, if you need this often, you could write an rbind-like function that takes an additional argument naming the data.frame whose column classes all the other data.frames' columns should be coerced to:

    myrbind <- function(x, ..., template=x) {
        do.call(rbind, c(list(template[1,]), list(x), list(...)))[-1,]
    }
    
    str(myrbind(NAs, dat,  template=dat))
    # 'data.frame': 20 obs. of  2 variables:
    #  $ x: Factor w/ 3 levels "1","2","3": NA NA NA NA NA NA NA NA NA NA ...
    #  $ y: num  NA NA NA NA NA NA NA NA NA NA ...
    
    ## If no 'template' argument is supplied, myrbind acts just like rbind    
    str(myrbind(dat, NAs))
    # 'data.frame': 20 obs. of  2 variables:
    #  $ x: Factor w/ 3 levels "1","2","3": 3 3 3 3 2 3 1 1 3 2 ...
    #  $ y: num  0.303 1.77 -1.38 1.731 0.033 ...
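
Another option, offered here only as a sketch and not as part of the answer above, is to build the NA block from dat itself so that it already has the right column classes. Indexing the rows of a data.frame with NA returns all-NA rows that keep each column's class and factor levels. The name `NAs_typed` and the `set.seed` call are assumptions made for this example.

```r
# Hypothetical data mirroring the question's setup (seed added for repeatability)
set.seed(1)
dat <- data.frame(x = factor(sample(1:3, 10, TRUE)), y = rnorm(10))

# Row-indexing with NA gives all-NA rows; x stays a factor, y stays numeric
NAs_typed <- dat[rep(NA_integer_, nrow(dat)), ]

# Binding order no longer matters for the column classes
dat_backward <- rbind(NAs_typed, dat)
rownames(dat_backward) <- NULL  # replace the "NA", "NA.1", ... row names with 1..n

str(dat_backward)
```

With this approach the plain rbind call from the question can be kept, at the cost of constructing the NA block differently.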