
How to sum embedded dataframe columns in a dynamic R list?


I am learning to work with lists.

How would I sum a given column across the dataframes embedded in a dynamic list? For example, I want to sum the columns named First.allocation in all sublists whose names begin with Syria. In List1 below, the function or formula would return a vector of (7,9,11,13,15), the element-wise sum of the "First.allocation" columns of sublists Syria_One and Syria_Two.

As another example, in List2 below, the function or formula would return a vector of (1,2,3,4,5) representing the First.allocation column of the only sublist in the list that begins with "Syria".

List1 <- list(
  Syria_One = data.frame(
    First.allocation = c(1,2,3,4,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  ),
  Syria_Two = data.frame(
    First.allocation = c(6,7,8,9,10),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  ),
  Tunisia = data.frame(
    First.allocation = rep(0.2,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  ),
  Total = data.frame(
    First.allocation = rep(1,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  )
)

List2 <- list(
  Syria_One = data.frame(
    First.allocation = c(1,2,3,4,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  ),
  Tunisia = data.frame(
    First.allocation = rep(0.2,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  ),
  Total = data.frame(
    First.allocation = rep(1,5),
    Second.numerator = rep(0,5),
    Second.allocation = rep(0,5)
  )
)

Solution

  • To handle this dynamically, you can define a function that takes the list plus 2 parameters:

    • regex: the regular expression matched against the list names
    • col: the column name (or index)
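    # Keep the sublists whose names match `regex`, extract `col` from each,
    # then add the extracted vectors element-wise.
    fun <- function(lst, regex = "^Syria", col = 1) {
      Reduce(`+`, lapply(lst[grepl(regex, names(lst))], `[[`, col))
    }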
    
    fun(List1)
    # [1]  7  9 11 13 15
    
    fun(List2)
    # [1] 1 2 3 4 5
    

    To target sublists other than those matching "Syria", or a column other than "First.allocation", change these arguments when calling fun().
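
As a rough sketch of varying those arguments, the snippet below calls fun() with a column name and with a different pattern. The `lst` object is a reduced stand-in for List1 and the "^Egypt" pattern is purely illustrative; neither appears in the original. It also shows what seems to happen when nothing matches: Reduce() over an empty list returns NULL, so callers may want to check for that.

```r
# Reduced stand-in for the question's List1 (assumption: two columns are
# enough to illustrate the calls; the values match the question's data).
lst <- list(
  Syria_One = data.frame(First.allocation = c(1, 2, 3, 4, 5),
                         Second.allocation = rep(0, 5)),
  Syria_Two = data.frame(First.allocation = c(6, 7, 8, 9, 10),
                         Second.allocation = rep(0, 5)),
  Tunisia   = data.frame(First.allocation = rep(0.2, 5),
                         Second.allocation = rep(0, 5))
)

# Same function as in the answer above.
fun <- function(lst, regex = "^Syria", col = 1) {
  Reduce(`+`, lapply(lst[grepl(regex, names(lst))], `[[`, col))
}

# Select the column by name instead of by position.
by_name <- fun(lst, col = "First.allocation")

# A different name pattern and a different column.
tunisia_second <- fun(lst, regex = "^Tunisia", col = "Second.allocation")

# Hypothetical pattern that matches no sublist: the result is NULL.
no_match <- fun(lst, regex = "^Egypt")
```

Selecting by name is arguably safer than the default `col = 1` when the column order of the dataframes is not guaranteed.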