r, data-manipulation

Breaking ties with more than one column in R without using a package


Suppose I have the following matrix:

output <- cbind(c(1, 2, 3, 4, 5), c(1, 1, 2, 3, 1),
                c(1, 1, 1, 2, 1), c(1, 1, 1, 1, 1))

I want to break ties in the 4th column using the 3rd, then the 2nd, and then the 1st; that is, I would like to end up with this matrix instead:

output_1 <- cbind(c(1,2,5,3,4), c(1,1,1,2,3), c(1,1,1,1,2), c(1,1,1,1,1))

I tried output_1 <- output[order(output[,4], output[,3], output[,2], output[,1]),], which gives me the output I want. But is there a way to automate this, so that whenever I add a new column I don't need to hard-code it into the call?

Many thanks in advance


Solution

  • One way to order programmatically is to use do.call on a list of the ordering components (matrix columns).

    output[do.call(order, rev(asplit(output, 2))),]
    #      [,1] [,2] [,3] [,4]
    # [1,]    1    1    1    1
    # [2,]    2    1    1    1
    # [3,]    5    1    1    1
    # [4,]    3    2    1    1
    # [5,]    4    3    2    1
    

    Walk-through:

    • asplit(., 2) converts a matrix into a list, where MARGIN=2 means by column. We could just as easily have converted only a subset of the columns;

    • since you want to prioritize them last-column-first, it's easy enough to reverse them with rev; it's just as easy to use a specific order, as in asplit(output, 2)[c(4,2,3,1)];

    • (Note that we could instead have reordered the columns before splitting, as in asplit(output[,4:1], 2), with no need for rev.)

    • the result is a list, which is precisely what do.call needs as its second argument; if you aren't familiar with do.call, then the two calls below are equivalent:

      L <- list(1:3, 4, 5:99)
      # somefunc is a placeholder for any function
      do.call(somefunc, L)
      somefunc(L[[1]], L[[2]], L[[3]])
      
    • since do.call(order, ...) returns an integer vector giving the desired row order, we just pass that as the i= argument to [ to subset and reorder the matrix rows.
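If you do this often, the steps above can be wrapped in a small helper. The sketch below is only one way to do it: the name order_by_cols and its cols argument are made up for illustration, and it builds the key list with lapply rather than asplit (asplit was, as far as I recall, only added in R 3.6.0, so this should also run on older versions).

```r
# Hypothetical helper (not part of base R): sort the rows of a matrix using
# its columns as successive tie-breakers, last column first by default.
order_by_cols <- function(m, cols = rev(seq_len(ncol(m)))) {
  # Build one sort key per column, in priority order
  keys <- lapply(cols, function(j) m[, j])
  # do.call spreads the list of keys out as separate arguments to order()
  m[do.call(order, keys), , drop = FALSE]
}

output <- cbind(c(1, 2, 3, 4, 5), c(1, 1, 2, 3, 1),
                c(1, 1, 1, 2, 1), c(1, 1, 1, 1, 1))

# Same result as the hard-coded order(output[,4], output[,3], ...) call
sorted <- order_by_cols(output)
hardwired <- output[order(output[, 4], output[, 3], output[, 2], output[, 1]), ]

# Adding a column needs no change to the call
wider <- cbind(output, c(2, 1, 2, 1, 1))
sorted_wider <- order_by_cols(wider)
```

Because cols defaults to every column in reverse, a new column is picked up automatically as the highest-priority key; pass something like cols = c(4, 2, 3, 1) if you want a specific priority instead.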