
Unable to use aggregate() with a data.frame of mode 'list'


Preliminary steps:

#======================
# Add a 'height' column to the built-in data.frame CO2
height <- runif(84, 30.0, 44)
CO2 <- cbind(CO2, height)  # assign the result, otherwise CO2 is unchanged
#======================

Aggregating CO2 Data Frame yields correct results:

    > aggregate(cbind(height,uptake)~conc, CO2, mean)
      conc   height   uptake
    1   95 37.04813 12.25833
    2  175 38.14815 22.28333
    3  250 34.70362 28.87500
    4  350 32.81782 30.66667
    5  500 37.19268 30.87500
    6  675 36.16915 31.95000
    7 1000 37.33184 33.58333

Alternatively,
> aggregate(CO2[,cbind("height","uptake")], by = list(CO2$conc), FUN = mean)
  Group.1   height   uptake
1      95 37.04813 12.25833
2     175 38.14815 22.28333
3     250 34.70362 28.87500
4     350 32.81782 30.66667
5     500 37.19268 30.87500
6     675 36.16915 31.95000
7    1000 37.33184 33.58333

However, when I convert CO2 into a list:

> CO2list <- lapply(CO2, as.data.frame)
> summary(CO2list)
          Length Class      Mode
Plant     1      data.frame list
Type      1      data.frame list
Treatment 1      data.frame list
conc      1      data.frame list
uptake    1      data.frame list
height    1      data.frame list

With CO2list, all four aggregate() attempts below fail with errors.

Question: how can I make aggregate() work with CO2list, whose elements are data.frames of mode 'list'?

> aggregate(cbind(height,uptake)~conc, CO2list, mean)
Error in model.frame.default(formula = cbind(height, uptake) ~ conc, data = CO2list) : 
  invalid type (list) for variable 'cbind(height, uptake)'

> aggregate(CO2list[,cbind("height","uptake")], by = list(CO2list$conc), FUN = mean)
Error in CO2list[, cbind("height", "uptake")] : 
  incorrect number of dimensions
> aggregate(cbind(height,uptake), by = list(CO2list$conc), FUN = mean)
Error in cbind(height, uptake) : object 'uptake' not found
> aggregate(cbind(CO2list$height,CO2list$uptake), by = list(CO2list$conc), FUN = mean)
Error in aggregate.data.frame(cbind(CO2list$height, CO2list$uptake), by = list(CO2list$conc),  : 
  arguments must have same length

Thanks


Solution

  • CO2list is a list of single-column data.frames, and the column name inside each element has also changed. One option is to convert it back to a single data.frame by cbind-ing the list elements and restoring the original names, then apply aggregate()

    newDat <- setNames(do.call(cbind, CO2list), names(CO2list))
    aggregate(cbind(height,uptake)~conc, newDat, mean)
    #  conc   height   uptake
    #1   95 39.15248 12.25833
    #2  175 35.38677 22.28333
    #3  250 38.56924 28.87500
    #4  350 37.73494 30.66667
    #5  500 35.37963 30.87500
    #6  675 36.26344 31.95000
    #7 1000 36.43538 33.58333
    

    Or extract the columns from the list elements and use them in aggregate() directly

    aggregate(cbind(height = CO2list[["height"]][[1]], 
                    uptake = CO2list[["uptake"]][[1]]), 
          list(conc = CO2list[["conc"]][[1]]), FUN = mean)
    #  conc   height   uptake
    #1   95 39.15248 12.25833
    #2  175 35.38677 22.28333
    #3  250 38.56924 28.87500
    #4  350 37.73494 30.66667
    #5  500 35.37963 30.87500
    #6  675 36.26344 31.95000
    #7 1000 36.43538 33.58333
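
A further option, offered only as a minimal sketch: pull the single column out of each list element and rebuild one data.frame before aggregating. The seed, the name `CO2df` (a plain copy of CO2 with the extra column) and the name `rebuilt` are assumptions added for illustration and are not part of the question.

```r
# Reproducible setup; the seed is an assumption so the run can be repeated.
set.seed(1)
CO2df <- cbind(as.data.frame(datasets::CO2), height = runif(84, 30, 44))

# Same conversion as in the question: a list of single-column data.frames.
CO2list <- lapply(CO2df, as.data.frame)

# Inspect one element: the inner column no longer carries its original name.
str(CO2list$uptake)

# Take the first (only) column of every element, then rebuild a data.frame.
# The list names (Plant, Type, ...) become the column names again.
rebuilt <- as.data.frame(lapply(CO2list, `[[`, 1))

# aggregate() now works as it did on the original data.frame.
res_list <- aggregate(cbind(height, uptake) ~ conc, data = rebuilt, FUN = mean)
res_orig <- aggregate(cbind(height, uptake) ~ conc, data = CO2df, FUN = mean)

# Compare the two results; prints TRUE when they agree.
print(all.equal(res_list, res_orig))
```

Compared with do.call(cbind, ...) followed by setNames(), this avoids the renaming step because the names are taken from the list itself. Either way, the key point is that aggregate() needs an actual data.frame (or plain vectors), not a list of data.frames.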