I have a for-loop that extracts equal-length numerical ranges and appends each range as a new list within a list. After the for-loop completes I want to turn the list of lists into a data frame and calculate the row means. I currently cannot convert the list of lists to a tibble, because the individual lists are not named.
How can I name each of the new lists as the integer associated with the loop iteration? (i.e. 1, 2, 3 ...)
Simplified example below. My actual for loop does something more complicated, but it captures the issue I'm having.
Thanks!
library(tidyverse)
input_df <- data.frame(
  Name = c("A", "B", "C"),
  Start = c(100,250,350),
  Stop = c(150,300,400)
)
All_list <- list()
for(i in 1:length(input_df$Name)) {
  myoutput <- data.frame(
    Value=seq.int(paste0(input_df$Start[i]), paste0(input_df$Stop[i]))
  )
  #How do I specify that the name of added list is the value of the iteration [i]?
  All_list[[i]] <- myoutput$Value
}
All_df <- All_list %>%
  as_tibble() %>%
  rowwise() %>%
  dplyr::mutate(Average = mean(c_across(1:length(myvector)))) %>%
  ungroup()
Error in `as_tibble()`:
! Columns 1, 2, and 3 must be named.
Use `.name_repair` to specify repair.
As you're loading tidyverse, you can rewrite your loop using purrr::pmap() to map over the columns of each row in parallel. We can rename() the Start and Stop columns to from and to, so that pmap() passes them to seq.int() under those argument names:
out_df <- input_df |>
  rename(from = Start, to = Stop) |>
  pmap(seq.int) |>
  set_names(input_df$Name) |>
  as_tibble()
head(out_df, 3)
# # A tibble: 3 × 3
# A B C
# <int> <int> <int>
# 1 100 250 350
# 2 101 251 351
# 3 102 252 352
pmap() returns a list identical to the one created by your loop. We then use set_names() to name its elements after the Name column of input_df, and as_tibble() to convert the named list into a tibble.
I think you're then trying to calculate the average by row. You can use the base rowMeans() function for this:
out_df %>%
  mutate(Average = rowMeans(.))
# # A tibble: 51 × 4
# A B C Average
# <int> <int> <int> <dbl>
# 1 100 250 350 233.
# 2 101 251 351 234.
# 3 102 252 352 235.
Incidentally you can get these means in a one-liner in base R:
rowMeans(apply(input_df, 1, \(row) seq(row["Start"], row["Stop"])))
# Same output as Average column
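One thing to be aware of with that one-liner: apply() first converts the data frame to a matrix, and because Name is a character column, Start and Stop reach the function as strings. A possible alternative, sketched here on the assumption that all ranges have the same length (as in your example), is to pass the two numeric columns to mapply() directly:

```r
input_df <- data.frame(
  Name  = c("A", "B", "C"),
  Start = c(100, 250, 350),
  Stop  = c(150, 300, 400)
)

# mapply() pairs up Start[i] and Stop[i]; because every range has the
# same length, the result is simplified to a matrix with one column per row
# of input_df
ranges <- mapply(seq, input_df$Start, input_df$Stop)

# Mean across the ranges at each position
means <- rowMeans(ranges)
head(means, 3)
```

If the ranges could differ in length, mapply() would return a list rather than a matrix and rowMeans() would no longer apply.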
In response to your comment, if you have to use a loop, simply set the names when you create the All_list variable, prior to your loop. So rather than All_list <- list(), you can do:
All_list <- vector(mode = "list", length = nrow(input_df)) |>
  setNames(input_df$Name)
It is also better to create a list of the length you will need, avoiding the well-known performance bottleneck of growing a vector in R. See p.12 of The R Inferno. (Things have improved a bit since then but it's still more efficient to create a list of the required length when it is known in advance.)
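Putting this together with your original loop, a minimal sketch might look like the following. It is only an illustration: it assumes every range has the same length (as in your example), and it uses base R throughout, so as.data.frame() stands in for as_tibble().

```r
input_df <- data.frame(
  Name  = c("A", "B", "C"),
  Start = c(100, 250, 350),
  Stop  = c(150, 300, 400)
)

# Pre-allocate a list of the required length and name its elements up front
All_list <- vector(mode = "list", length = nrow(input_df)) |>
  setNames(input_df$Name)

for (i in seq_len(nrow(input_df))) {
  # Assigning by position keeps the names that were set above
  All_list[[i]] <- seq.int(input_df$Start[i], input_df$Stop[i])
}

# Each named, equal-length element becomes a column
All_df <- as.data.frame(All_list)
All_df$Average <- rowMeans(All_df)
head(All_df, 3)
```

If you would rather name the elements by iteration number, as in your question, names(All_list) <- seq_along(All_list) does that; the numbers are stored as the character names "1", "2", "3".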