I have a large data frame with species in rows and samples in columns. There are 30 samples with 12 replicates each. The column names follow this pattern: sample.S1.01, sample.S1.02, ..., sample.S30.11, sample.S30.12.
I would like to create 30 new tables, each containing the 12 replicates for one sample.
I have this command that works perfectly for one sample at a time:
dt<- tab_sp_sum %>%
select(starts_with("sample.S1."))
assign(paste("tab_sp_1"), dt)
But when I put this in a for loop, it no longer works. I think this is because the variable i is passed inside the starts_with call, and I don't know how to write it.
for (i in 1:30){
dt<- tab_sp_sum %>%
select(starts_with("sample.S",i,".", sep=""))
assign(paste("tab_sp",i,sep="_"), dt)
}
The last line works well: 30 tables are created with the right names, but they are empty.
Any suggestions?
Thank you
Instead of using assign and storing the results in separate objects, try using a list. Create the prefixes that you want to select using paste0, and then use map to create a list of dataframes.
library(dplyr)
library(purrr)
df_names <- paste0("sample.S", 1:30, ".")
df1 <- map(df_names, ~tab_sp_sum %>% select(starts_with(.x)))
You can then use df1[[1]], df1[[2]], and so on to access the individual dataframes.
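If you would rather keep the original for loop, note that starts_with() takes the prefix as a single string in its first argument and does not paste its arguments together, so the prefix has to be built with paste0() first. A minimal sketch, assuming a small made-up data frame (toy and sample_ids are hypothetical stand-ins for tab_sp_sum and 1:30) and storing the results in a named list rather than calling assign:

```r
library(dplyr)

# Toy stand-in for tab_sp_sum (hypothetical data): 3 samples x 2 replicates
toy <- data.frame(
  sample.S1.01 = 1:3,    sample.S1.02 = 4:6,
  sample.S2.01 = 7:9,    sample.S2.02 = 10:12,
  sample.S10.01 = 13:15, sample.S10.02 = 16:18
)

sample_ids <- c(1, 2, 10)
tabs <- list()
for (i in sample_ids) {
  # Build the full prefix first; the trailing "." keeps S1 from matching S10
  prefix <- paste0("sample.S", i, ".")
  tabs[[paste0("tab_sp_", i)]] <- toy %>% select(starts_with(prefix))
}

names(tabs)
names(tabs$tab_sp_1)
```

Each table is then available as tabs$tab_sp_1, tabs$tab_sp_2, and so on, without filling the global environment with 30 separate objects.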
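In base R, we can use lapply with startsWith to select the columns whose names begin with each prefix in df_names (startsWith compares literally, whereas a regex built with paste0("^", x) would treat each "." as a wildcard and let "sample.S1." also match the sample.S10 to sample.S19 columns).
df1 <- lapply(df_names, function(x)
tab_sp_sum[startsWith(names(tab_sp_sum), x)])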
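One thing to watch for when matching these prefixes in base R: in a regular expression an unescaped "." matches any character, so a pattern such as ^sample.S1. also matches sample.S10.01. A small sketch with made-up column names (cols and prefix are hypothetical) comparing a regex match against startsWith, which compares the prefix literally:

```r
cols <- c("sample.S1.01", "sample.S1.02", "sample.S10.01", "sample.S2.01")
prefix <- "sample.S1."

# Regex: each "." is a wildcard, so "sample.S10.01" is matched as well
regex_hit <- grepl(paste0("^", prefix), cols)
regex_hit
# [1]  TRUE  TRUE  TRUE FALSE

# Literal prefix comparison: only the two S1 columns match
literal_hit <- startsWith(cols, prefix)
literal_hit
# [1]  TRUE  TRUE FALSE FALSE
```

The dplyr helper starts_with also matches literally, which is why the map version is not affected.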
Using it with the built-in iris dataset:
df_names <- c("Sepal", "Petal")
df1 <- map(df_names, ~iris %>% select(starts_with(.x)))
head(df1[[1]])
# Sepal.Length Sepal.Width
#1 5.1 3.5
#2 4.9 3.0
#3 4.7 3.2
#4 4.6 3.1
#5 5.0 3.6
#6 5.4 3.9
head(df1[[2]])
# Petal.Length Petal.Width
#1 1.4 0.2
#2 1.4 0.2
#3 1.3 0.2
#4 1.5 0.2
#5 1.4 0.2
#6 1.7 0.4
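If accessing the tables by position is inconvenient, the list can also be given names. A minimal base R sketch on iris, assuming the prefixes themselves are acceptable as element names:

```r
df_names <- c("Sepal", "Petal")

# One data frame per prefix, selected by literal prefix match
df1 <- lapply(df_names, function(x) iris[startsWith(names(iris), x)])

# Name the list elements so they can be accessed with $ or [["..."]]
names(df1) <- df_names

names(df1$Petal)
# [1] "Petal.Length" "Petal.Width"
```

For the original data, something like names(df1) <- paste0("tab_sp_", 1:30) would give the same names the assign approach was aiming for.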