r, select, dplyr, multiple-columns, organization

Is there an easy way to order a large number of columns without using dplyr::select() in R?


I'm working with a very large dataset with lots of columns (400+), and every time I create or add a new variable I have to reorder the columns. I want related variables to stay together, so I've been using dplyr::select() to reorder things. But sometimes I have to go back to an early part of my script and add a new variable. When I rerun the whole script after that, there tend to be one or two variables I forgot to add to the later select() calls, so they go missing.

I use select() because selecting all the columns between two variables by name is super easy (e.g., Vfour:Vthreefifty). Do you have any tips for reordering datasets with lots of columns?


Solution

  • No reproducible example was given, so this uses the "V" prefix shared by your two column names:

    library(dplyr)

    # Keep every column whose name starts with "V"
    df %>%
      select(starts_with("V"))
    

    You can then combine several starts_with() calls in one select() as needed.

    Other selection helpers include ends_with(), contains(), and matches().
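As a rough sketch of how the helpers could be combined for the problem in the question: listing the groups in order and ending with everything() keeps any column that was not named, so a variable added earlier in the script is moved to the end rather than dropped. The data frame and column names below (id, V_age, score_a, and so on) are made up for illustration, and relocate() assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data; the column names are invented for this example.
df <- tibble(
  id       = 1:3,
  V_age    = c(30, 41, 25),
  score_a  = c(1, 2, 3),
  V_height = c(170, 165, 180),
  score_b  = c(4, 5, 6)
)

# Put related columns together by prefix. everything() appends all
# columns not already selected, so nothing goes missing.
ordered <- df %>%
  select(id, starts_with("V_"), starts_with("score_"), everything())

names(ordered)

# For a single new variable, relocate() moves it without listing
# any of the other columns.
with_new <- ordered %>%
  mutate(V_bmi = 22) %>%
  relocate(V_bmi, .after = V_height)

names(with_new)
```

Because neither call needs the full column list, adding a variable upstream should not require touching later reordering steps, though the new column may land at the end until it is placed explicitly.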