r dplyr tidyselect

Drop column range starting with first column matching regex


I have the following data frame, which mimics the output of read_excel when some column names are missing in the Excel sheet:

t <- tibble(A=rnorm(3), B=rnorm(3), "x"=rnorm(3), "y"=rnorm(3), Z=rnorm(3))
colnames(t)[3:4] <-  c("..3", "..4")

How can I drop columns ..3 through Z in a flexible, dynamic way (without depending on column positions or table width)? I was thinking of something like:

t %>% select(-starts_with(".."):-last_col())

But this gives a warning, since starts_with returns two values (one for each matching column).


Solution

  • We could force it to use only the first match:

    t %>% select(-c(starts_with("..")[1]:last_col()))
    # # A tibble: 3 x 2
    #       A      B
    #   <dbl>  <dbl>
    # 1 0.889  0.505
    # 2 0.655 -2.15 
    # 3 1.34  -0.290
    

    Or, for a "tidier" way, use first():

    t %>% select(-first(starts_with("..")):-last_col())
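
Since the title asks about a regex match, here is a minimal base R sketch of the same idea. It is an alternative to the tidyselect calls above, not a replacement for them. The names df, first_unnamed and kept are made up for illustration, and the pattern "^\\.\\." is an assumption about how the unnamed columns are labelled. It also guards against the case where no column matches, in which case nothing is dropped.

```r
library(tibble)

# Same shape as the data in the question; values are arbitrary
df <- tibble(A = rnorm(3), B = rnorm(3), x = rnorm(3), y = rnorm(3), Z = rnorm(3))
colnames(df)[3:4] <- c("..3", "..4")

# Position of the first column whose name matches the regex;
# NA if there is no match (indexing an empty vector with [1] gives NA)
first_unnamed <- grep("^\\.\\.", names(df))[1]

# Keep only the columns before the first match; keep everything if no match
kept <- if (is.na(first_unnamed)) df else df[, seq_len(first_unnamed - 1)]

names(kept)
# "A" "B"
```

The regex can be swapped for any other pattern without touching the rest, which is the main reason to compute the position explicitly.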