Average of data frame values in a list by name


I have a question related to R and nested lists. Let's assume I have a nested list with this structure:

library(tidyr)
library(purrr)

simul<-list(
  "Q"=tibble(
    months = seq(1,12,by=1),
    run1 = runif(12),
    run2 = runif(12),
    run3= runif(12)),
  "ET1"=tibble(
    months = seq(1,12,by=1),
    run1 = runif(12),
    run2 = runif(12),
    run3= runif(12)),
  "ET2"=tibble(
    months = seq(1,12,by=1),
    run1 = runif(12),
    run2 = runif(12),
    run3= runif(12)),
  "ET3"=tibble(
    months = seq(1,12,by=1),
    run1 = runif(12),
    run2 = runif(12),
    run3= runif(12))
  )

I am trying to average across the "ET" variables (ET1, ET2, ET3) separately for each run, while keeping the lower-level structure of the list. Variables belong to the same group when their names share the same leading characters and differ only in the number at the end.

So far I have solved the problem as shown below. However, I was wondering whether you could suggest a better way, one that I can easily apply to a larger list with more "variables" and "runs".

nested_avg <-list("Q"=simul[["Q"]],
                  "ET"=tibble(months = simul$ET1$months,
                              run1=rowMeans(do.call(cbind, simul[c("ET1","ET2","ET3")]%>% map(2))),
                              run2=rowMeans(do.call(cbind, simul[c("ET1","ET2","ET3")]%>% map(3))),
                              run3=rowMeans(do.call(cbind, simul[c("ET1","ET2","ET3")]%>% map(4)))
                  ))

Thank you so much for your answer.


Solution

  • Converting the list with enframe, grouping by name, and converting back with deframe does the trick:

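    library(dplyr)    # group_by(), summarise(), n()
    library(stringr)  # str_remove_all()
    library(tibble)   # enframe(), deframe(), as_tibble()

    simul %>%
      setNames(str_remove_all(names(.), "\\d")) %>%  # "ET1" -> "ET"
      enframe() %>%
      group_by(name) %>%
      summarise(value = list(as_tibble(Reduce("+", value) / n()))) %>%
      deframe()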
    

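    Note that as_tibble is needed to get tibbles back, because the arithmetic (Reduce with "+", followed by the division) returns a plain data frame rather than a tibble. The months column is averaged as well, which leaves it unchanged because it is identical in every table.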
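
For comparison, the same grouping idea can be sketched in base R without any packages. This is only a rough sketch, not the answer's method: it assumes every table in a group has the same columns and row order, it uses plain data frames instead of tibbles, and the helper make_df and the seed are made up for the example.

```r
# Hypothetical example data with the same shape as `simul`
# (plain data frames; make_df is an illustrative helper, not from the question).
set.seed(1)
make_df <- function() {
  data.frame(months = 1:12, run1 = runif(12), run2 = runif(12), run3 = runif(12))
}
simul <- list(Q = make_df(), ET1 = make_df(), ET2 = make_df(), ET3 = make_df())

# Group key: the element name with its trailing digits removed ("ET1" -> "ET").
key <- sub("\\d+$", "", names(simul))

# A factor with levels in order of first appearance keeps the original ordering.
groups <- factor(key, levels = unique(key))

# Split the list by group, then average the tables of each group element-wise.
nested_avg <- lapply(split(simul, groups), function(dfs) {
  Reduce(`+`, dfs) / length(dfs)
})

names(nested_avg)
str(nested_avg$ET)
```

Each element of nested_avg is a plain data frame here; wrapping the result of the division in as_tibble() would give tibbles, as in the answer above.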