Correctly order/sort columns based on string in colnames tidyverse style in R


This is just a slice of a large dataframe that I have

dput(MyData)
structure(list(Frui1_Trea4_Ty4_0d = c(10L, 4L, 28L, 147L, 6L), 
    Frui1_Trea4_Ty4_14d = c(18L, 0L, 26L, 70L, 27L), Frui1_Trea4_Ty8_0d = c(9L, 
    1L, 21L, 168L, 6L), Frui1_Trea4_Ty8_14d = c(19L, 0L, 58L, 74L, 
    10L), Frui2_Trea4_Ty4_0d = c(40L, 6L, 39L, 141L, 15L), Frui2_Trea4_Ty4_14d = c(74L, 
    1L, 91L, 24L, 8L), Frui2_Trea4_Ty8_0d = c(22L, 0L, 50L, 54L, 
    17L), Frui2_Trea4_Ty8_14d = c(80L, 0L, 43L, 65L, 9L)), row.names = c("MTC88", 
"MTND2P28", "MTCO1P12", "MTATP6P1", "MTCO3P12"), class = "data.frame")

I have many other columns, but they all follow the same naming "logic".

I've been struggling because I just want to reorder the columns so that all the columns whose names end with "_0d" come first in the data frame, and the ones that end with "_14d" are kept together at the end.

I've tried

MyData %>% dplyr::select(sort(names(.)))

which works if I wanted to arrange alphabetically, but when I try something like:

  MyData %>% dplyr::select(names(stringr::str_sort("d0", "d7")))

I just get an error. I suppose there's a workaround with select(contains(.)), but I can't seem to get it right. Can anyone help? I have many more columns, and since I am also filtering for further analysis, I want to keep it the "tidyverse way".


Solution

  • One solution is to use ends_with():

    MyData %>%
      dplyr::select(ends_with("_0d"), ends_with("_14d"))
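
  • If there are more time points than just "_0d" and "_14d", listing one ends_with() per suffix gets tedious. A possible alternative, sketched here under the assumption that every column name ends in "_<number>d", is to pull the number of days out of each name and order the columns by it. The toy data frame df and the helper variable days below are made up for illustration; they are not part of the original data.

```r
library(dplyr)

# Hypothetical, shortened data frame with the same naming pattern
df <- data.frame(
  Frui1_Trea4_Ty4_0d  = 1:2,
  Frui1_Trea4_Ty4_14d = 3:4,
  Frui2_Trea4_Ty4_0d  = 5:6,
  Frui2_Trea4_Ty4_14d = 7:8
)

# Extract the number of days from the end of each column name
# (assumes every name ends in "_<number>d")
days <- as.numeric(sub(".*_(\\d+)d$", "\\1", names(df)))

# order() leaves ties in their original order, so columns keep their
# relative position within each time point
reordered <- df %>%
  select(all_of(names(df)[order(days)]))

names(reordered)
```

    Sorting on the numeric value rather than the raw string also avoids "_14d" landing before "_7d", which a plain alphabetical sort would do.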