Search code examples
raverage

creating a loop for a simulation with linked variables in r


I'm simulating species richness based on extinction and speciation rates using the sim.bd function from the geiger package. I need to run the simulation 1000 times. When you run it once you get 3 columns of data:

    time   n
[1]   1    6
[2] .....

What I want after the simulation has run 1000 times is the average n for each time. I know to create a loop to get this far:

c.sprich<-c()   
for (i in 100) 
{
pop.carn <- sim.bd(b=0.39, d=0.55, n0=4, times=66)
c.sprich[i]<- ##I don't know what command to put here to get the average n linked to each time 
}

Any ideas?

I've tried functions like length() but that results in the sum of everything and not what I want


Solution

  • I suggest a base-R approach based on replicate() and rowMeans() like so:

    library(geiger)
    set.seed(1) # to make results reproducible
    sim <- replicate(1000, sim.bd(b = .39, d = .55, n0 = 4L, times = 1:66L)[, "n"]) |>
      {\(x) cbind("time" = 0L:66L, "AveragePerN" = round(rowMeans(x), digits = 3L)) }()
    

    gives

    > head(sim)
         time AveragePerN
    [1,]    0       4.000
    [2,]    1       3.131
    [3,]    2       2.672
    [4,]    3       2.306
    [5,]    4       1.978
    [6,]    5       1.624
    

    (1) replicate() "is a loop" which executes sim.bd() 1,000 times. As we are interested in the n, we only collect those from each sim.bd()-run. We do not want the time column to be printed 1,000 times, since nothing changes there. (2) We then column-bind the vector of times with the row means. Additionally, we round those averages returned from rowMeans() by 3 digits. This, of course, is optional.