`mixedorder` sorting of a data.table with vector of column names as input


This is my sample data:

library(data.table)
set.seed(47)
colsize <- 10
DT <- data.table(sortcol1 = as.numeric(sample(CJ(1:27,1:27,1:27)[,paste0(V1,V2,V3)],colsize)),
               sortcol2 = sample(CJ(c( LETTERS, " "),c(1:27),c(letters," "))[,paste0(V1,V2,V3)],colsize),
               sortcol3 = sample(CJ(c( LETTERS, " "),c(1:27),c(letters," "))[,paste0(V1,V2,V3)],colsize),
               sortcol4 = sample(CJ(c( LETTERS, " "),c(1:27),c(letters," "))[,paste0(V1,V2,V3)],colsize),
               sortcol5 = sample(CJ(c( LETTERS, " "),c(1:27),c(letters," "))[,paste0(V1,V2,V3)],colsize),
               stringsAsFactors = FALSE)

DT
#     sortcol1 sortcol2 sortcol3 sortcol4 sortcol5
#        <num>   <char>   <char>   <char>   <char>
#  1:   241420     F12e     V10y     D23v     J10b
#  2:    71714      P1j     Q11y      D3s     F11e
#  3:    18617      E3w     U18t      25n     L14r
#  4:     4254     A12i     P15f     I27h      L9m
#  5:   131325     K27w      E8j      B6q       1q
#  6:     1958      Q1k     H23u     A18f     P23 
#  7:   151619      C1y      R7i      27q     U26l
#  8:     9262      C9n     N15o     P19q      I7t
#  9:    11518      C8l     P22a      E5g     X18z
# 10:     3224     Z26b     X17b      Y5z     Q12k

This is a user input from another source:

input <- c("sortcol2", "sortcol1")

This serves the purpose with hard-coded column names, but I need it to be dynamic, taking the column names from the object input:

library(gtools)
DT[mixedorder(sortcol2, sortcol1 , decreasing = FALSE)]

The columns of DT can hold numbers, characters, or strings that mix both. Some columns can be entirely numeric.

input has length at least 1, and the column names in it can appear in any order.

## sample input values
input <- c("sortcol3","sortcol1")
input <- c("sortcol5")
input <- c("sortcol1","sortcol5","sortcol3") # etc

data.table::setorderv does not sort mixed alphanumeric values in natural order: it compares strings character by character, so "C10" comes before "C9".
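To make the difference concrete, here is a minimal sketch comparing the two orderings on a small made-up table. The table `toy` and its column `id` are hypothetical names used only for illustration.

```r
library(data.table)
library(gtools)

# Hypothetical toy data, not part of the question's DT
toy <- data.table(id = c("C10", "C9", "C1"))

# setorderv() compares strings character by character
setorderv(toy, "id")
toy$id              # "C1" "C10" "C9"

# mixedsort() treats the embedded digits as numbers
mixedsort(toy$id)   # "C1" "C9" "C10"
```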

I need this to be in data.table.


Solution

  • Possible solution

    This function seems to work well. I have adapted it to take the column names from input.

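    library(gtools)  # provides mixedsort()

    # Return the row order of `data` sorted by `cols`, using mixed
    # (natural) order for character columns.
    multi.mixedorder <- function(data, cols, na.last = TRUE, decreasing = FALSE) {
      do.call(order, c(
        lapply(cols, function(col) {
          if (is.character(data[[col]])) {
            # Factor levels in mixed order, so order() ranks by level position
            factor(data[[col]], levels = mixedsort(unique(data[[col]])))
          } else {
            # Non-character columns are ordered as they are
            data[[col]]
          }
        }),
        list(na.last = na.last, decreasing = decreasing)
      ))
    }

    DT[multi.mixedorder(DT, input), ]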
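If the table should be reordered by reference, as setorderv does, one possible variant is sketched below. It assumes the same idea (rank character columns in mixed order) but stores the ranks in temporary columns and sorts on those. The helper name `set_mixedorderv`, the `.mixedrank` column prefix, and the table `toy` are hypothetical and not part of data.table or gtools. Note that NA placement follows setorderv's defaults here, which may differ from the function above.

```r
library(data.table)
library(gtools)

# Hypothetical helper: reorder DT in place by `cols`, using mixed order
# for character columns.
set_mixedorderv <- function(DT, cols, order = 1L) {
  tmp <- paste0(".mixedrank", seq_along(cols))  # temporary sort-key columns
  DT[, (tmp) := lapply(.SD, function(x) {
    # Character columns become integer ranks in mixed order;
    # other columns are used as they are.
    if (is.character(x)) match(x, mixedsort(unique(x))) else x
  }), .SDcols = cols]
  setorderv(DT, tmp, order = order)  # sort by reference on the keys
  DT[, (tmp) := NULL]                # drop the temporary columns
  invisible(DT)
}

# Hypothetical toy data
toy <- data.table(id = c("C10", "C9", "C9", "A2"), n = c(1, 5, 3, 2))
set_mixedorderv(toy, c("id", "n"))
toy$id   # "A2" "C9" "C9" "C10"
toy$n    # 2 3 5 1
```

With the question's data, the call would be set_mixedorderv(DT, input).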