Run a pop-gen simulation several times and store the results of each in a new column on a data frame

I have a basic Wright-Fisher simulation for two alleles that works wonderfully, and produces a good looking plot showing the allele fixing or dying out by chance as expected. It exports every generation as calculated into a data frame d, so I have the values at each generation to hand. What I want to do is run the whole thing say 20 times and store each complete simulation in a new column automatically, so I can plot all of them on a ggplot graph with colours and all that good stuff. I'm mostly interested in getting a tidy frame to make good-looking plots for a project rather than breakneck efficiency.

#Wright Fisher model Mk1

#Simulation Parameters
# n = pop.size
# f = frequency of focal allele
# x = number of focal allele, do not set by hand
# y = number of the other allele, do not set by hand
# g = generations desired
n = 200
f = 0.6
x = (n*f)
y = (n-x)
g = 200

#This creates a data frame of the correct size to store each generation

d = data.frame(f = rep(0,g))

#Creates the graph.
plot(1,0, type = "n", xlim = c(1,200), ylim = c(0,n),
     xlab = "Generation", ylab = "Frequency A")

#Creates the population, this model is limited to only two alleles, and can only plot one
alleles<- c(rep("A",x), rep("a",y))

#this is the loop that actually simulates the population
#It has code for plotting each generation on the graph as a point 
#Exports the number of focal allele A to the data frame
for (i in 1:g){ 
  alleles <- sample(alleles, n, replace = TRUE)
points(i, length(alleles[alleles=="A"]), pch = 19, col= "red")
F = sum(alleles == "A")
d[i, ] = c(F)
}

So I want to run that last bit multiple times and store each complete iteration somehow. I know I could loop the function by nesting it, even though this is quick and dirty, but doing this only stores the values of the last iteration of the outer loop.

Solution

There's a lot of opportunity for improvement here, but this should get you going. I am only showing five simulations, but you should be able to extend. In essence, place the bulk of your code a function and then you can use either map functions from the purrr package or you could also do something with replicate:

library(tidyverse)

n = 200
f = 0.6
x = (n*f)
y = (n-x)
g = 200

d = data.frame(f = rep(0,g))

run_sim <- function() {
  alleles <- c(rep("A", x), rep("a", y))

  for (i in 1:g) { 
    alleles <- sample(alleles, n, replace = TRUE)
    cnt_A = sum(alleles == "A")
    d[i, ] = c(cnt_A)
  }

  return(d)
}

sims <- paste0("sim_", 1:5)

set.seed(4) # for reproducibility

sims %>%
  map_dfc(~ run_sim()) %>%
  set_names(sims) %>%
  gather(simulation, results) %>%
  group_by(simulation) %>%
  mutate(period = row_number()) %>%
  ggplot(., aes(x = period, y = results, group = simulation, color = simulation)) +
  geom_line()

^{Created on 2019-03-21 by the reprex package (v0.2.1)}

NOTE: You could also add arguments to the run_sim function for say x and y (i.e., run_sim <- function(x, y) { ... }), which would allow you to explore other possibilities.