I have data.frame with the column names shown in the example. V1.1 = test series 1, day 1, K1-K3 are three unique samples. I want to order my columns by days, so that the order would be V1.1, V1.2, V1.14, V1.21. I think that this is no big problem, but I can't find the correct answer for my problem in the forum.
A tibble: 6 x 168
genus V1.1.K1 V1.1.K2 V1.1.K3 V1.14.K16 V1.14.K17 V1.14.K18 V1.2.K4 V1.2.K5 V1.2.K6 V1.21.K19 V1.21.K20 V1.21.K21
<chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 A17 58 7 78 0 0 0 0 0 0 0 0 0
2 Acan~ 0 0 0 0 0 0 0 0 0 0 0 0
3 Acet~ 8 0 5 155 256 341 0 28 26 188 142 83
4 Achr~ 0 0 0 0 0 0 0 0 0 0 0 0
5 Acid~ 0 0 7 0 0 0 0 83 6 43 13 0
6 Acid~ 0 0 0 0 0 0 0 0 0 0 0 0
First, I tried to solve the problem with
select(1:3, 7:9, 4:5, 11:13)
But then it is not reproducible to other data.frames, I am working with, because there the amount of colums or the days are different.
Or is there an argument to add to my data-import, which is done by
read_tsv()
? You would help me a lot :) Kathrin
Since you want them ordered by the numbers, disregarding the letters (or lexicographic sorts), then here's a way. Since you're using select
, I'll infer you have %>%
and others.
library(magrittr) # %>%, not needed if you have dplyr loaded
library(stringr) # str_extract_all
ord <- str_extract_all(names(dat), "\\d+") %>%
lapply(., `length<-`, max(lengths(.))) %>%
lapply(., as.integer) %>%
do.call(rbind.data.frame, .) %>%
do.call(function(...) order(..., na.last = FALSE), .)
ord
# [1] 1 2 3 4 8 9 10 5 6 7 11 12 13
dat[,ord] # or select(dat, all_of(ord))
# genus V1.1.K1 V1.1.K2 V1.1.K3 V1.2.K4 V1.2.K5 V1.2.K6 V1.14.K16 V1.14.K17 V1.14.K18 V1.21.K19 V1.21.K20 V1.21.K21
# 1 A17 58 7 78 0 0 0 0 0 0 0 0 0
# 2 Acan~ 0 0 0 0 0 0 0 0 0 0 0 0
# 3 Acet~ 8 0 5 0 28 26 155 256 341 188 142 83
# 4 Achr~ 0 0 0 0 0 0 0 0 0 0 0 0
# 5 Acid~ 0 0 7 0 83 6 0 0 0 43 13 0
# 6 Acid~ 0 0 0 0 0 0 0 0 0 0 0 0
The na.last=FALSE
forces (in this case) genus
to be first. If you have various names with and without numbers, you may get inconsistent results, in which case you may need to subset the column names you use with this.
Data
dat <- structure(list(genus = c("A17", "Acan~", "Acet~", "Achr~", "Acid~", "Acid~"), V1.1.K1 = c(58L, 0L, 8L, 0L, 0L, 0L), V1.1.K2 = c(7L, 0L, 0L, 0L, 0L, 0L), V1.1.K3 = c(78L, 0L, 5L, 0L, 7L, 0L), V1.14.K16 = c(0L, 0L, 155L, 0L, 0L, 0L), V1.14.K17 = c(0L, 0L, 256L, 0L, 0L, 0L), V1.14.K18 = c(0L, 0L, 341L, 0L, 0L, 0L), V1.2.K4 = c(0L, 0L, 0L, 0L, 0L, 0L), V1.2.K5 = c(0L, 0L, 28L, 0L, 83L, 0L), V1.2.K6 = c(0L, 0L, 26L, 0L, 6L, 0L), V1.21.K19 = c(0L, 0L, 188L, 0L, 43L, 0L), V1.21.K20 = c(0L, 0L, 142L, 0L, 13L, 0L), V1.21.K21 = c(0L, 0L, 83L, 0L, 0L, 0L)), class = "data.frame", row.names = c("1", "2", "3", "4", "5", "6"))