Search code examples
rfor-loopsimulationcorrelationmass

How to create a for loop of simulated correlation coefficients in R?


I plan to run 10,000 simulations on an existing dataset, and want to find the correlation coefficient of each simulation and save it into a dataframe (theoretically ending up with 10,000 correlation coefficients). How can I write code to run the simulation and print out the coefficient each time in a for loop?

This is the latest code I tried:

for (i in 1:10000) {
  bnsamp <- data.frame(mvrnorm(n = 48, mu = mean_combined, Sigma = sigma))
  colnames(bnsamp) <- c("Amyg_Avg", "Sys_Jus")
  cor_coeff <- cor.test(x= bnsamp$Amyg_Avg, y= bnsamp$Sys_Jus, 
                        method= "pearson", alternative= "two.sided")
   cor_coef_est <- list(cor_coeff$estimate == i)
   cor_coef_p <- list(cor_coeff$p.value == i)
   coeff_list <- data.frame(cor_coef_est, cor_coef_p)
   print(coeff_list)
}

Ultimately, my problem is how to loop the recalculation of the correlation and print it every single time (for later to be plotted into a distribution). Thank you!


Solution

  • Here are two ways of simulating the wanted values.

    1. With a for loop:
    2. with replicate

    From help('replicate'), my emphasis:

    replicate is a wrapper for the common use of sapply for repeated evaluation of an expression (which will usually involve random number generation).

    library(MASS)
    
    mean_combined <- c(1, 2)
    sigma <- matrix(c(2, 1, 3, 2), ncol = 2L)
    
    # replicates, but not the 10,000 in the question,
    # only 1,000 is enough for code tests
    R <- 1000L
    
    # with a for loop:
    # 1. create the results matrix beforehand
    # 2. simulate from multivariate normal
    # 3. run the test
    # 4. and assign the results
    set.seed(2023)
    coeff_mat <- matrix(nrow = R, ncol = 2,
                        dimnames = list(NULL, c("Amyg_Avg", "Sys_Jus")))
    
    for(i in seq.int(R)) {
      bnsamp <- mvrnorm(n = 48L, mu = mean_combined, Sigma = sigma)
      cor_coeff <- cor.test(x = bnsamp[, 1L], y = bnsamp[, 2L], 
                            method= "pearson", alternative= "two.sided")
      cor_coef_est <- cor_coeff$estimate
      cor_coef_p <- cor_coeff$p.value
      coeff_mat[i, ] <- c(cor_coef_est, cor_coef_p)
    }
    
    # with help('replicate')
    # 1. simulate from multivariate normal
    # 2. run the test
    # 3. and assign the results
    # 4. the final pipe to t() puts the matrix in the wanted form
    set.seed(2023)
    res <- replicate(R, {
      bnsamp <- mvrnorm(n = 48L, mu = mean_combined, Sigma = sigma)
      cor_coeff <- cor.test(x = bnsamp[, 1L], y = bnsamp[, 2L], 
                            method= "pearson", alternative= "two.sided")
      c(Amyg_Avg = unname(cor_coeff$estimate), Sys_Jus = cor_coeff$p.value)
    }) |> t()
    
    identical(res, coeff_mat)
    #> [1] TRUE
    
    head(res)
    #>       Amyg_Avg      Sys_Jus
    #> [1,] 0.3999366 4.856605e-03
    #> [2,] 0.6007393 6.353057e-06
    #> [3,] 0.6641624 2.652077e-07
    #> [4,] 0.6094018 4.287226e-06
    #> [5,] 0.4825884 5.132400e-04
    #> [6,] 0.5514744 4.854691e-05
    

    Created on 2023-11-06