Tags: r, sorting, filter, tidyverse

Select the last sequentially numbered column from a dataframe in R where columns are not ordered


Suppose I have the following code:

> set.seed(25)
>
> x = data.frame(col1 = rnorm(5),
                 col2 = rnorm(5),
                 col3 = rnorm(5),
                 col4 = rnorm(5),
                 col5 = rnorm(5),
                 other_data1 = sample(letters, 5),
                 other_data2 = sample(letters, 5),
                 other_data3 = sample(letters, 5))
>
> x = x[ ,sample(names(x))]
> x
  other_data3       col4       col1        col3 other_data2       col5        col2 other_data1
1           s  0.9275789 -0.2118336 -1.74278763           l  0.5574109 -0.44553326           i
2           l -0.7167693 -1.0415911 -1.32495298           h  0.1687815  1.73404543           u
3           y  0.9623997 -1.1533076 -0.54793388           i  0.1552589  0.51129562           r
4           g  1.5458846  0.3215315 -1.45638428           b  2.3677647  0.09964504           p
5           z -1.0097636 -1.5001299  0.08268682           x -1.5856444 -0.05789111           n

I would like to select column colN, where N is the last numbered column for arbitrary N (in this example, it would be column col5).

Is there a way to do this in R, either using the functions in base R or using tidyverse filters?


Solution

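  • A base R solution that works for any N: pick out the colN names, extract the numeric suffix, and index by the name with the largest number.

    # Keep only names of the form "col<digits>" (a bare "col" pattern would also match e.g. "protocol")
    col_names <- grep("^col\\d+$", names(x), value = TRUE)
    # Strip the non-digit prefix and convert the rest to a number
    ind <- as.numeric(sub("\\D+", "", col_names))
    # Select the column with the largest number (returns a one-column data frame)
    x[col_names[which.max(ind)]]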
            col5
    1  0.5574109
    2  0.1687815
    3  0.1552589
    4  2.3677647
    5 -1.5856444
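
The same steps can be wrapped in a small reusable function. This is only a sketch: the helper name `last_numbered_col`, its `prefix` argument, and the `col9`/`col10` sample columns are assumptions made for illustration, not part of the original question. The sample data is chosen to show that the comparison is numeric, so `col10` ranks above `col9` (a plain alphabetical sort would put `col9` last). The tidyverse variant at the end runs only if dplyr is installed.

```r
# Hypothetical helper: name of the highest-numbered "<prefix><N>" column.
# Assumes `prefix` contains no regex metacharacters.
last_numbered_col <- function(df, prefix = "col") {
  # Match names that are exactly the prefix followed by digits, e.g. "col12".
  pattern <- paste0("^", prefix, "[0-9]+$")
  candidates <- grep(pattern, names(df), value = TRUE)
  if (length(candidates) == 0) {
    stop("no columns named '", prefix, "<N>' found")
  }
  # Drop the prefix and compare the suffixes as integers.
  n <- as.integer(sub(paste0("^", prefix), "", candidates))
  candidates[which.max(n)]
}

set.seed(25)
x <- data.frame(col1 = rnorm(5), col2 = rnorm(5), col10 = rnorm(5),
                col9 = rnorm(5), other_data1 = sample(letters, 5))
x <- x[, sample(names(x))]  # shuffle the column order

target <- last_numbered_col(x)
x[target]  # base R: one-column data frame

# tidyverse equivalent, run only if dplyr is available
if (requireNamespace("dplyr", quietly = TRUE)) {
  dplyr::select(x, dplyr::all_of(target))
}
```

Use `x[[target]]` instead of `x[target]` if a plain vector is wanted rather than a one-column data frame.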