R: Taking the mean of an element in data frames contained in a list


I've assembled a list of data frames that contain the coefficients of nls. This is part of a custom bootstrapping (actually, bagging) method. I'd like to calculate the mean of each of the parameters in the data frames.

After sampling xdata and ydata, the list is populated in a loop containing:

fit = nls(ydata ~ A*cos(2*pi*((xdata-x_0)/z))+M,start=list(A=4,M=-7,x_0=-10,z=30))
fitdata = summary(fit)$coefficients
fitresults[[i]] = fitdata

The list contains 100 data frames, as below:

     Estimate Std. Error    t value     Pr(>|t|)
A    3.945959  0.1729441  22.816381 3.440064e-14
M   -8.349697  0.1656195 -50.414926 5.920106e-20
x_0 -3.677582  0.5717355  -6.432313 6.194560e-06
z   33.680613  1.1314373  29.767989 4.158598e-16

I'd like to calculate the mean of each element in the first column across the list, i.e. the mean Estimate for each of A, M, x_0, and z.

I've played around a bit with dply functions, but I cannot get it.

Many thanks!


Solution

  • To make the example reproducible, I create a list called mylist containing two data frames. There are many ways to extract the first column; one is to sapply over the list, something like this:

    set.seed(1)
    # data.frame() converts "Pr(>|t|)" to the syntactic column name Pr...t..
    mylist <- list(list1 = data.frame(matrix(rnorm(16),
                                             ncol = 4,
                                             nrow = 4,
                                             dimnames = list(row = c("A", "M", "x_0", "z"),
                                                             column = c("Estimate",
                                                                        "Std.Error",
                                                                        "t_Value",
                                                                        "Pr(>|t|)")))),

                   list2 = data.frame(matrix(rnorm(16),
                                             ncol = 4,
                                             nrow = 4,
                                             dimnames = list(row = c("A", "M", "x_0", "z"),
                                                             column = c("Estimate",
                                                                        "Std.Error",
                                                                        "t_Value",
                                                                        "Pr(>|t|)")))))
    
    mylist
    
    $list1
          Estimate  Std.Error    t_Value    Pr...t..
    A   -0.6264538  0.3295078  0.5757814 -0.62124058
    M    0.1836433 -0.8204684 -0.3053884 -2.21469989
    x_0 -0.8356286  0.4874291  1.5117812  1.12493092
    z    1.5952808  0.7383247  0.3898432 -0.04493361
    
    $list2
          Estimate   Std.Error     t_Value   Pr...t..
    A   -0.01619026  0.91897737  0.61982575 -0.4781501
    M    0.94383621  0.78213630 -0.05612874  0.4179416
    x_0  0.82122120  0.07456498 -0.15579551  1.3586796
    z    0.59390132 -1.98935170 -1.47075238 -0.1027877
    

    Then take the row-wise mean of the extracted Estimate columns:

    result <- data.frame(Estimates = apply(sapply(mylist, function(x) x[, "Estimate"]), 1, mean))
    rownames(result) <- c("A", "M", "x_0", "z")
    
    result
          Estimates
    A   -0.321322037
    M    0.563739767
    x_0 -0.007203709
    z    1.094591062
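
The same idea can be applied to the list from the question. Note that summary(fit)$coefficients returns a matrix rather than a data frame, so indexing the "Estimate" column keeps the row names and they do not have to be reassigned afterwards. The following is a minimal sketch under that assumption; make_coef_matrix is a hypothetical helper, and the simulated values only stand in for real nls output.

```r
# Hypothetical stand-in for the question's fitresults: a list of 100
# coefficient matrices shaped like summary(fit)$coefficients.
# The numbers are simulated, not real nls fits.
set.seed(42)
make_coef_matrix <- function() {
  matrix(rnorm(16), nrow = 4,
         dimnames = list(c("A", "M", "x_0", "z"),
                         c("Estimate", "Std. Error", "t value", "Pr(>|t|)")))
}
fitresults <- replicate(100, make_coef_matrix(), simplify = FALSE)

# Extract the Estimate column from each matrix. Matrix indexing keeps the
# row names, so sapply() returns a 4 x 100 matrix with rows A, M, x_0, z.
estimates <- sapply(fitresults, function(m) m[, "Estimate"])

# Average each parameter across all fits.
mean_estimates <- rowMeans(estimates)
mean_estimates
```

If the means of all four columns are wanted at once, Reduce(`+`, fitresults) / length(fitresults) should give the element-wise average of the matrices, provided every matrix has the same dimensions.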