Tags: r, performance, simulation, sapply, processing-efficiency

Should I use sapply to run simulations that don't require an argument?


I am running a simulation thousands of times that does not require any arguments. Here is a very simple example:

simulate <- function() sum(sample(1:10, size = 5))

I could run

results <- rep(0,1000)
for(i in 1:1000){
  results[i] <- simulate()
}

...but I've read many times that for loops are slow in R, and I need to maximize speed (the actual simulation I am doing is much more time intensive than this).

  1. Should I use a member of the apply family on results and if so how?
  2. Is sapply still faster than a for loop if the elements of results aren't being used in the simulate function?

Solution

  • You can use sapply for this, but in cases like this I usually prefer replicate.

    set.seed(123)
    replicate(10, simulate())
    #[1] 29 24 27 29 29 19 22 31 28 23
    

    You can also use rerun in purrr, which works much like replicate but returns a list rather than a simplified vector. Note that rerun is deprecated as of purrr 1.0.0.


    With sapply, you would pass an anonymous function that ignores its argument:

    sapply(1:10, function(X) simulate())
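On the speed question, a minimal sketch may help. It assumes the toy simulate() from the question and an arbitrary run count n, and it adds vapply as a third option that the answer above does not mention. With the same seed, a preallocated for loop, replicate, and vapply draw from the random number stream in the same order, so they should return the same values; what differs is only the looping overhead, which is likely small next to an expensive simulation.

```r
# Toy simulation from the question.
simulate <- function() sum(sample(1:10, size = 5))

n <- 1000  # number of runs (arbitrary choice for this sketch)

# 1. Preallocated for loop, as in the question.
set.seed(123)
res_loop <- numeric(n)
for (i in seq_len(n)) {
  res_loop[i] <- simulate()
}

# 2. replicate(), which is documented as a wrapper around sapply().
set.seed(123)
res_rep <- replicate(n, simulate())

# 3. vapply() with a declared return type; the index argument is ignored.
set.seed(123)
res_vap <- vapply(seq_len(n), function(i) simulate(), numeric(1))

# Each approach calls simulate() n times in sequence,
# so the results should agree element by element.
all.equal(res_loop, res_rep)
all.equal(res_loop, res_vap)

# Rough timing; the numbers depend on the machine and on simulate() itself.
system.time(for (i in seq_len(n)) res_loop[i] <- simulate())
system.time(replicate(n, simulate()))
```

If the timings come out close, as they plausibly will when the loop is preallocated, the choice between these forms is mostly about readability, and any real speedup would have to come from making simulate() itself cheaper.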