
How to sort a data.table using a target vector


I have the following data.table:

library(data.table)
DT = data.table(x=rep(c("b","a","c"),each=3), y=c(1,2,3))

> DT
   x y
1: b 1
2: b 2
3: b 3
4: a 1
5: a 2
6: a 3
7: c 1
8: c 2
9: c 3

And I have the following vector

k <- c("2","3","1")

I want to sort DT by y, using k as the target order for the values of y, to get this:

> DT
   x y
1: b 2
2: a 2
3: c 2
4: b 3
5: a 3
6: c 3
7: b 1
8: a 1
9: c 1

Any ideas? If I use DT[order(k)] I get a subset of the original data, and that isn't what I am looking for.


Solution

  • Wrap y in a call to match() inside order(), so that each row is ranked by the position of its y value in k.

    DT[order(match(y, as.numeric(k)))]
    #    x y
    # 1: b 2
    # 2: a 2
    # 3: c 2
    # 4: b 3
    # 5: a 3
    # 6: c 3
    # 7: b 1
    # 8: a 1
    # 9: c 1
    

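    DT[order(match(y, k))] would work as well, because match() coerces its two arguments to a common type before comparing them. It is still safer to make both arguments the same class explicitly, so the result does not depend on how the numbers happen to be converted to strings.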
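
    To see why order(k) alone returns only a subset while the match() version reorders every row, it can help to look at the intermediate index vectors. The following is a small base R sketch on plain vectors that mirror the question's data; the names idx_k, pos and idx are illustrative only.

    ```r
    # Toy data mirroring the question: y is recycled to 1,2,3,1,2,3,1,2,3.
    y <- rep(c(1, 2, 3), times = 3)
    k <- c("2", "3", "1")

    # order(k) only sees the 3 elements of k, so it yields just 3 row numbers.
    idx_k <- order(k)

    # match() maps each y value to its position in the target vector...
    pos <- match(y, as.numeric(k))

    # ...and order() on those positions gives a permutation of all 9 rows.
    idx <- order(pos)

    print(idx_k)  # 3 1 2
    print(pos)    # 3 1 2 3 1 2 3 1 2
    print(idx)    # 2 5 8 3 6 9 1 4 7
    ```

    Indexing DT with idx_k therefore picks rows 3, 1 and 2 only, whereas idx touches every row exactly once, with ties kept in their original order.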

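    Note: base match() builds its lookup table afresh on every call, which can be slow on large inputs. If you have a large number of rows, you may want to switch to fastmatch::fmatch, a drop-in replacement that keeps its hash table attached to the table vector for reuse.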
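
    A minimal sketch of that swap, assuming the fastmatch package may or may not be installed: it uses fastmatch::fmatch when available and falls back to base match() otherwise. The names target, matcher and sorted are hypothetical, introduced only for this illustration.

    ```r
    library(data.table)

    DT <- data.table(x = rep(c("b", "a", "c"), each = 3), y = c(1, 2, 3))
    k <- c("2", "3", "1")

    # Convert the target once so both arguments to the matcher are numeric.
    target <- as.numeric(k)

    # Assumption: fastmatch is optional; fall back to base match() if missing.
    matcher <- if (requireNamespace("fastmatch", quietly = TRUE)) {
      fastmatch::fmatch
    } else {
      match
    }

    # Same idea as the answer above: rank rows by the position of y in target.
    sorted <- DT[order(matcher(y, target))]
    print(sorted)
    ```

    With a three-element target like this one, any speed difference is likely negligible; the cached hash table matters mostly when the same large table vector is matched against repeatedly.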