Preserve ordered factor when using ddply

I use ddply a lot. I use ordered factors occasionally. Calling ddply on a data frame that contains an ordered factor drops any ordering in the recombined data frame.

I wrote the following wrapper for ddply that records level ordering and then re-applies it on any columns that were ordered originally:

library(plyr)

dat <- data.frame(a=runif(10),
                  b = factor(letters[10:1],levels = letters[10:1],ordered=TRUE),
                  c = rep(letters[1:2],times=5),
                  d = factor(rep(c('lev1','lev2'),times=5),ordered=TRUE))

#Drops ordering on b and d      
dat1 <- ddply(dat,.(c),transform,log_a = log(a))

ddplyKeepOrder <- function(dat,...){
    orderedCols <- colnames(dat)[sapply(dat,is.ordered)]
    levs <- lapply(dat[,orderedCols,drop=FALSE],levels)
    result <- ddply(.data = dat,...)

    ind <- match(orderedCols,colnames(result))
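    # Keep only the ordered columns that survived the ddply call
    levs <- levs[!is.na(ind)]
    orderedCols <- orderedCols[!is.na(ind)]
    ind <- ind[!is.na(ind)]
    if (length(ind) > 0){
        for (i in seq_along(ind)){
            result[,orderedCols[i]] <- factor(result[,orderedCols[i]],
                                              levels = levs[[i]],ordered = TRUE)
        }
    }
    return(result)
}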

#Preserves ordering on b and d
dat2 <- ddplyKeepOrder(dat,.variables = .(c),.fun = transform,log_a = log(a))

I haven't checked this function thoroughly so there might be cases it doesn't handle. Is there a better/more complete way to handle this? I could probably remove the for loop if I thought about it a bit, I suppose.

In particular, the checking I do after the ddply call to see if there are still any of the original ordered factors present seems really ugly, but I would like the function to be able to handle cases where ddply alters which columns are present, possibly removing ordered factors.



  • I use the code below for this kind of problem (as a replacement for ddply, not specifically for ordered factors), and it seems to handle your example without issue, other than producing different row names.

    > dat2 <- do.call(rbind, lapply(split(dat, dat$c), transform, log_a=log(a)))
    > str(dat2)
    'data.frame':   10 obs. of  5 variables:
     $ a    : num  0.216 0.607 0.197 0.171 0.797 ...
     $ b    : Ord.factor w/ 10 levels "j"<"i"<"h"<"g"<..: 1 3 5 7 9 2 4 6 8 10
     $ c    : Factor w/ 2 levels "a","b": 1 1 1 1 1 2 2 2 2 2
     $ d    : Ord.factor w/ 2 levels "lev1"<"lev2": 1 1 1 1 1 2 2 2 2 2
     $ log_a: num  -1.532 -0.499 -1.625 -1.767 -0.227 ...
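
If you would rather keep ddply, here is one rough way to drop the for loop and the index bookkeeping from the wrapper. This is only a sketch: `restoreOrdered` is a hypothetical helper name (not part of plyr), and it assumes the ordered columns keep their original names in the result. The demo fakes a "lost ordering" result with `transform` so it runs without plyr.

```r
# Hypothetical helper: re-apply ordered-factor levels from `template`
# to every column of `result` that shares a name with an ordered column.
restoreOrdered <- function(result, template) {
  isOrd <- vapply(template, is.ordered, logical(1))
  # Ordered columns of the template that are still present in the result
  keep <- intersect(names(template)[isOrd], names(result))
  if (length(keep) == 0) return(result)
  # Rebuild each surviving column with its original level order
  result[keep] <- lapply(keep, function(nm) {
    factor(result[[nm]], levels = levels(template[[nm]]), ordered = TRUE)
  })
  result
}

dat <- data.frame(a = c(0.5, 1, 2, 4),
                  b = factor(c("hi", "lo", "mid", "lo"),
                             levels = c("lo", "mid", "hi"), ordered = TRUE),
                  c = c("x", "y", "x", "y"))

# Stand-in for a result whose ordering was dropped (plain factor, new column)
res <- transform(dat, b = factor(as.character(b)), log_a = log(a))

fixed <- restoreOrdered(res, dat)
is.ordered(fixed$b)   # ordering is back, with the original level order
```

Inside a wrapper you would call it as `restoreOrdered(ddply(.data = dat, ...), dat)`. Using `intersect` on the column names covers the case where ddply removes some of the ordered factors, so the `match`/`is.na` filtering is no longer needed.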