Search code examples
rdataframeshinydatatable

Sorting factor columns of a datatable in R using factor values


I'm working in R and I have different dataframes all with different / multiple columns defined as factor. Then I visualize these dataframes as datatables inside my R Shiny App. When I sort any of the columns of type factor, the sorting is alphabetical and it doesn't take into account the actual factor values. Hence, I'm trying to create a function like the one below such that it checks if the input dataframe has factor columns and automatically fix their sorting 'issue' such that they are not sorted alphabetically but instead using the actual factor values. I found an interesting question here Ordering factors in data table using DT package however, this would be for one column only. Would you know how to fix my attempt below?

library(shiny)
library(DT)

# Define a function to create a datatable with factor columns sorted by factor values
create_datatable <- function(df) {
  # Identify the factor columns
  factor_cols <- sapply(df, is.factor)
  
  # Create a list of column definitions for factor columns
  factor_defs <- lapply(which(factor_cols), function(i) {
    list(targets = i-1, type = 'natural')
  })
  
  # Create the datatable with column definitions
  DT::datatable(df, 
                options = list(orderClasses = TRUE),
                rownames = FALSE,
                class = 'cell-border stripe',
                columnDefs = factor_defs
  )
}


# Define a sample dataframe with factor columns
df <- data.frame(
  x = factor(c("A", "B", "C")),
  y = factor(c("Low", "High", "Medium")),
  z = factor(c("Yes", "No", "Maybe")),
  a = c(1, 2, 3),
  b = c(4, 5, 6)
)

# Define the UI
ui <- fluidPage(
  DT::dataTableOutput("mytable")
)

# Define the server
server <- function(input, output) {
  output$mytable <- DT::renderDataTable({
    # Create the datatable with factor columns sorted by factor values
    create_datatable(df)
  })
}

# Run the app
shinyApp(ui, server)

Solution

  • If we look at the page for columnDefs and then convert their example to an R structure, we can see what type of nested-list it is looking for:

    str(jsonlite::fromJSON('[
            { "targets": [0, 1], "visible": true},
            { "targets": "_all", "visible": false }
        ]'))
    # 'data.frame': 2 obs. of  2 variables:
    #  $ targets:List of 2
    #   ..$ : int  0 1
    #   ..$ : chr "_all"
    #  $ visible: logi  TRUE FALSE
    

    whereas your list is

    str(factor_defs)
    # List of 3
    #  $ x:List of 2
    #   ..$ targets: num 0
    #   ..$ type   : chr "natural"
    #  $ y:List of 2
    #   ..$ targets: num 1
    #   ..$ type   : chr "natural"
    #  $ z:List of 2
    #   ..$ targets: num 2
    #   ..$ type   : chr "natural"
    

    Further, it needs all of that to be within another list, and then applied within the options= list.

    We can fix it by using this instead of your lapply version:

    create_datatable <- function(df) {
      # Identify the factor columns
      factor_cols <- sapply(df, is.factor)
      if (any(factor_cols)) {
        # Create a list of column definitions for factor columns
        factor_defs <- do.call(
          c,
          unname(Map(function(orig, lvl) list(list(targets=orig, orderData=lvl),
                                              list(targets=lvl, visible=FALSE)),
                     which(factor_cols) - 1, ncol(df) + seq_len(sum(factor_cols)) - 1))
        )
        df <- cbind(df, lapply(df[,factor_cols,drop=FALSE], function(z) match(z, levels(z))))
      } else factor_defs <- NULL
      # Create the datatable with column definitions
      DT::datatable(df, 
                    options = list(orderClasses = TRUE, columnDefs = factor_defs),
                    rownames = FALSE,
                    class = 'cell-border stripe'
      )
    }
    

    Demonstration:

    tmp <- mtcars # no factors
    create_datatable(tmp)
    # sort by `cyl`
    

    mtcars, no factors, sort by cyl, is a natural sort

    tmp$cyl <- factor(tmp$cyl, levels = c("6", "4", "8"))
    create_datatable(tmp)
    

    mtcars, cyl is a factor, sort by cyl, ordered by levels