Order column names in ascending order within dplyr chain


I have this data.frame:

df <- structure(list(
  att_number = structure(1:3, .Label = c("0", "1", "2"), class = "factor"),
  `1`  = structure(c(2L, 3L, 1L), .Label = c("1026891", "412419", "424869"), class = "factor"),
  `10` = structure(c(2L, 1L, 3L), .Label = c("235067", "546686", "92324"), class = "factor"),
  `2`  = structure(c(3L, 1L, 2L), .Label = c("12729", "7569", "9149"), class = "factor")
), .Names = c("att_number", "1", "10", "2"), row.names = c(NA, -3L), class = "data.frame")

It looks like this, with numbers as the column names:

att_number  1         10        2
         0  412419    546686    9149
         1  424869    235067    12729
         2  1026891   92324     7569

Within a dplyr chain, I would like to order the columns in ascending order, like this:

att_number  1       2      10
         0  412419  9149   546686
         1  424869  12729  235067
         2  1026891 7569   92324

I've tried using select_, but it doesn't want to work according to plan. Any idea on how I can do this? Here's my feeble attempt:

names_order <- names(df)[-1] %>%
  as.numeric %>%
  .[order(.)] %>%
  as.character %>%
  c('att_number', .)

df %>%
  select_(.dots = names_order)

This fails with:

Error: Position must be between 0 and n

Solution

  • Update:

    For newer versions of dplyr (>= 0.7.0):

    library(tidyverse)
    
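    sort_names <- function(data) {
      name  <- names(data)
      # names containing at least one non-digit character, sorted alphabetically
      chars <- keep(name, grepl, pattern = "[^0-9]") %>% sort()
      # purely numeric names, sorted by value and converted back to strings
      nums  <- discard(name, grepl, pattern = "[^0-9]") %>%
        as.numeric() %>%
        sort() %>%
        sprintf("%s", .)
    
      # splice the names in as strings so they are matched by name, not position
      select(data, !!!c(chars, nums))
    }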
    
    sort_names(df)
    

    Original:

    You need backticks around the numeric column names to stop select_ from interpreting them as column positions:

    library(tidyverse)
    
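    sort_names <- function(data) {
      name  <- names(data)
      # names containing at least one non-digit character, sorted alphabetically
      chars <- keep(name, grepl, pattern = "[^0-9]") %>% sort()
      # purely numeric names, sorted by value and wrapped in backticks
      nums  <- discard(name, grepl, pattern = "[^0-9]") %>%
                 as.numeric() %>%
                 sort() %>%
                 sprintf("`%s`", .)
    
      # select_ was deprecated in dplyr 0.7.0; see the update above for newer versions
      select_(data, .dots = c(chars, nums))
    }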
    
    sort_names(df)
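
To make the position-versus-name point concrete, here is a minimal sketch, assuming dplyr >= 1.0.0 (for all_of()). The data frame is a stand-in with the same column names as the question but plain numbers instead of factors, and order_numeric_names is a hypothetical helper, not part of dplyr. It first shows how a bare number and a backticked name select different columns, then reorders the columns in one chain by passing the names as strings:

```r
library(dplyr)

# Stand-in for the question's data: same column names, plain numbers instead of factors.
df <- data.frame(
  att_number = 0:2,
  `1`  = c(412419, 424869, 1026891),
  `10` = c(546686, 235067, 92324),
  `2`  = c(9149, 12729, 7569),
  check.names = FALSE  # keep "1", "10", "2" as-is instead of X1, X10, X2
)

# A bare number is a column position; a backticked name is a column name.
by_position <- names(select(df, 2))    # second column, which is named "1"
by_name     <- names(select(df, `2`))  # the column that is named "2"

# Hypothetical helper: order purely numeric names by their numeric value.
order_numeric_names <- function(nms) nms[order(as.numeric(nms))]

# all_of() takes a character vector of names, so no backticks are needed.
result <- df %>%
  select(
    att_number,
    all_of(order_numeric_names(setdiff(names(.), "att_number")))
  )

names(result)
```

Ordering the original name strings with order() avoids converting numbers back to text, which could change how a name is written (for example, a very large number printed in scientific notation). The sketch assumes every column other than att_number has a purely numeric name.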