I need to sort multiple data frames by a list of columns whose names contain non-alphabetic characters. For a single dataset I'd use this well-known solution, with a workaround for blanks and other special characters in the variable name:
df_sorted = df[with(df, order(varname, xtfrm(df[, "varname with blanks and\\slashes"]))), ]
But for multiple datasets it's more suitable to have a function with a vector of column names as an input:
sort_by_columns = function(col_names){...}
df_sorted = sort_by_columns(col_names = c("varname", "varname with blanks and\\slashes"))
How do I transform a vector into an argument suitable for order() inside my function?
Since there's no example data set for your problem, I'll use the iris data. My approach would be dplyr with tidy evaluation.
library(dplyr)
library(datasets)
data(iris)
# I'll rename one of the columns so that its name has a space and a backslash
# (the backslash needs to be escaped in the string to appear in the column name)
iris <- iris %>%
  rename('sepal \\length' = 'Sepal.Length')
# Data will be sorted by these columns, in the order listed
col_names <- c('sepal \\length', 'Sepal.Width')
data_sorted <- iris %>%
  arrange(!!!syms(col_names))
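For comparison, the same sort can also be written with tidyselect helpers instead of splicing symbols. This is only a minimal sketch, assuming dplyr 1.0.0 or later (where across() and all_of() are available); iris2, sorted_across and sorted_syms are names made up for the illustration, and the renamed column mirrors the example above.

```r
library(dplyr)

# Same setup as above: one column name contains a space and a backslash.
# iris2 is a hypothetical name used only for this sketch.
iris2 <- rename(datasets::iris, 'sepal \\length' = 'Sepal.Length')
col_names <- c('sepal \\length', 'Sepal.Width')

# all_of() selects columns by the strings stored in col_names
sorted_across <- iris2 %>%
  arrange(across(all_of(col_names)))

# Splicing symbols, as in the answer, for comparison
sorted_syms <- iris2 %>%
  arrange(!!!syms(col_names))

# Check that both approaches produce the same ordering
all.equal(sorted_across, sorted_syms)
```

Either form takes a plain character vector, so it can be dropped into a function unchanged.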
To turn this into a function:
sort_by_columns <- function(data, col_names){
  data_sorted <- data %>%
    arrange(!!!syms(col_names))
  return(data_sorted)
}
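If you'd rather stay in base R, which is closer to what the question literally asks, do.call() can pass each column to order() as a separate argument. The following is a sketch rather than a definitive implementation: sort_by_columns_base and the small df are hypothetical, made up here for illustration.

```r
# Hypothetical base R helper; the name sort_by_columns_base is an assumption
sort_by_columns_base <- function(data, col_names) {
  # Each selected column becomes one positional argument to order();
  # unname() keeps column names from being matched to order()'s own arguments
  keys <- unname(as.list(data[col_names]))
  data[do.call(order, keys), , drop = FALSE]
}

# Made-up example data; check.names = FALSE keeps the awkward column name intact
df <- data.frame(
  "varname" = c(2, 1, 2, 1),
  "varname with blanks and\\slashes" = c("b", "b", "a", "a"),
  check.names = FALSE
)
col_names <- c("varname", "varname with blanks and\\slashes")

# The same col_names vector can be reused for every data frame in a list
sorted_list <- lapply(
  list(first = df, second = df[4:1, ]),
  sort_by_columns_base,
  col_names = col_names
)
sorted_list$first
```

Because the columns are looked up by string with data[col_names], blanks and backslashes in the names need no special handling beyond escaping the backslash in the string literal.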