Suppose I have the following code:
> set.seed(25)
>
> x = data.frame(col1 = rnorm(5),
col2 = rnorm(5),
col3 = rnorm(5),
col4 = rnorm(5),
col5 = rnorm(5),
other_data1 = sample(letters, 5),
other_data2 = sample(letters, 5),
other_data3 = sample(letters, 5))
>
> x = x[ ,sample(names(x))]
> x
  other_data3       col4       col1        col3 other_data2       col5        col2 other_data1
1           s  0.9275789 -0.2118336 -1.74278763           l  0.5574109 -0.44553326           i
2           l -0.7167693 -1.0415911 -1.32495298           h  0.1687815  1.73404543           u
3           y  0.9623997 -1.1533076 -0.54793388           i  0.1552589  0.51129562           r
4           g  1.5458846  0.3215315 -1.45638428           b  2.3677647  0.09964504           p
5           z -1.0097636 -1.5001299  0.08268682           x -1.5856444 -0.05789111           n
I would like to select column colN, where N is the highest column number present, for arbitrary N (in this example, that would be column col5).
Is there a way to do this in R, using either base R functions or tidyverse filters?
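A solution that works for any N:
# keep only the names of the form col<digits>
col_names <- grep("^col\\d+$", names(x), value = TRUE)
# strip the non-digit prefix, then pick the name with the largest number
ind <- as.numeric(sub("\\D+", "", col_names))
x[col_names[which.max(ind)]]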
        col5
1  0.5574109
2  0.1687815
3  0.1552589
4  2.3677647
5 -1.5856444
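Since the question also asks about the tidyverse, here is a minimal sketch of the same idea with dplyr. It assumes dplyr is installed; the data frame below is a made-up example (not the one from the question) that includes a col10 column to show why the suffix is compared numerically rather than alphabetically. The names `n`, `last_col`, and `result` are only illustrative.

```r
library(dplyr)

set.seed(25)
# hypothetical data: col10 would sort before col2 alphabetically
x <- data.frame(col1 = rnorm(5),
                col2 = rnorm(5),
                col10 = rnorm(5),
                other_data1 = sample(letters, 5))

# names of the form col<digits>
col_names <- grep("^col\\d+$", names(x), value = TRUE)

# numeric suffix of each matching name
n <- as.integer(sub("^col", "", col_names))

# name of the column with the largest suffix
last_col <- col_names[which.max(n)]

# all_of() selects by a character vector of names
result <- x %>% select(all_of(last_col))
```

Here `result` is a one-column data frame holding col10. If no column matches the pattern, `last_col` is empty and `select()` returns a data frame with no columns, so it may be worth checking `length(col_names) > 0` first.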