r tidyselect

Select columns using starts_with()


With select(starts_with("A")) I can select all the columns in a dataframe/tibble starting with "A".

But how can I select all the columns in a dataframe/tibble starting with one of the letters in a vector?

Example:

columns_to_select <- c("A", "B", "C")
df %>% select(starts_with(columns_to_select))

I would like to select A1, A2, A3... and B1, B2, B3, ... and C1, C2, Cxy...


Solution

  • This already works the way you're describing: starts_with() accepts a character vector and selects the union of the matches:

    library(tidyverse)
    
    df <- tibble(A1 = 1:10, B1 = 1:10, C3 = 21:30, D2 = 11:20)
    
    columns_to_select <- c("A", "B", "C")
    
    df |> 
      select(starts_with(columns_to_select))
    #> # A tibble: 10 × 3
    #>       A1    B1    C3
    #>    <int> <int> <int>
    #>  1     1     1    21
    #>  2     2     2    22
    #>  3     3     3    23
    #>  4     4     4    24
    #>  5     5     5    25
    #>  6     6     6    26
    #>  7     7     7    27
    #>  8     8     8    28
    #>  9     9     9    29
    #> 10    10    10    30
    

    Do you mean to select by only one of the letters at a time? (You can use columns_to_select[1] for this.) Apologies if I've misunderstood the question; I can delete this answer if it isn't relevant.
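
For comparison, here is a minimal sketch of the three variants discussed here: selecting by the whole vector, by a single prefix, and by an equivalent regular expression passed to matches(). The data frame and the names prefixes, by_vector, only_a, pattern and by_regex are made up for illustration and are not from the question.

```r
library(dplyr)

# Hypothetical example data; column names chosen to mirror the question
df <- tibble(A1 = 1:3, A2 = 4:6, B1 = 7:9, C3 = 10:12, D2 = 13:15)
prefixes <- c("A", "B", "C")

# Whole vector: starts_with() takes the union of the matches for each prefix
by_vector <- df |> select(starts_with(prefixes))

# One prefix at a time: index into the vector
only_a <- df |> select(starts_with(prefixes[1]))

# Regex alternative: build "^(A|B|C)" and pass it to matches()
pattern <- paste0("^(", paste(prefixes, collapse = "|"), ")")
by_regex <- df |> select(matches(pattern))

names(by_vector)
names(only_a)
names(by_regex)
```

Note that starts_with() and matches() both ignore case by default, so a column named a1 would also be selected; pass ignore.case = FALSE if that matters. The pasted regex assumes the prefixes contain no regex metacharacters.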