Looping over sampling in R to create many vectors


I am running a PCA using the following R code:

sigma1 <- as.matrix((data[,3:22]))
sigma2 <- as.matrix((data[,23:42]))
sample1 <- mvrnorm(n = 250, mu = as_vector(data[,1]), Sigma = sigma1)
sample2 <- mvrnorm(n = 250, mu = as_vector(data[,2]), Sigma = sigma2)
sampCombined <- rbind(sample1, sample2);
covCombined <- cov(sampCombined);
covCombinedPCA <- prcomp(sampCombined);
eigenvalues <- covCombinedPCA$sdev^2;

I want to repeat/loop this, so that I have 50 vectors of eigenvalues. I then want to find the mean vector of eigenvalues over the 50 repetitions. How do I do this?


Solution

  • You can put your entire code in a function. Let's say the function is called eigen_fun.

    library(MASS)   # mvrnorm()
    library(purrr)  # as_vector()

    eigen_fun <- function(data) {
      # Columns 3:22 and 23:42 hold the two 20 x 20 covariance matrices
      sigma1 <- as.matrix(data[, 3:22])
      sigma2 <- as.matrix(data[, 23:42])
      # Columns 1 and 2 hold the two mean vectors
      sample1 <- mvrnorm(n = 250, mu = as_vector(data[, 1]), Sigma = sigma1)
      sample2 <- mvrnorm(n = 250, mu = as_vector(data[, 2]), Sigma = sigma2)
      sampCombined <- rbind(sample1, sample2)
      covCombinedPCA <- prcomp(sampCombined)
      # Squared standard deviations of the components are the eigenvalues
      covCombinedPCA$sdev^2
    }
    

    Running eigen_fun(data) once gives you one set of values. To repeat this 50 times, you can use replicate.

    mat <- replicate(50, eigen_fun(data))
    

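    Each column of mat holds the eigenvalues from one repetition, and each row corresponds to one eigenvalue position. To get the mean eigenvalue vector over the 50 repetitions, average across the columns with rowMeans:

    rowMeans(mat)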
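
The question's data is not shown, so here is a minimal self-contained sketch of the same pattern using made-up inputs. The names mu1, mu2, sigma1, sigma2 and eigen_once are hypothetical stand-ins, and the covariance matrices are randomly generated purely for illustration. It shows that replicate returns a matrix with one column per repetition, so the mean vector over repetitions comes from the row means.

```r
library(MASS)  # provides mvrnorm()

set.seed(1)  # make the simulated draws reproducible

p <- 20  # number of variables, matching the 20-column blocks in the question

# Hypothetical stand-ins for the question's inputs: two mean vectors and two
# symmetric positive-definite covariance matrices (adding diag(p) keeps them
# safely positive definite).
mu1 <- rep(0, p)
mu2 <- rep(1, p)
sigma1 <- crossprod(matrix(rnorm(p * p), p, p)) / p + diag(p)
sigma2 <- crossprod(matrix(rnorm(p * p), p, p)) / p + diag(p)

# One repetition: draw both samples, stack them, return the PCA eigenvalues.
eigen_once <- function() {
  sample1 <- mvrnorm(n = 250, mu = mu1, Sigma = sigma1)
  sample2 <- mvrnorm(n = 250, mu = mu2, Sigma = sigma2)
  prcomp(rbind(sample1, sample2))$sdev^2
}

mat <- replicate(50, eigen_once())
dim(mat)                     # 20 50: rows are eigenvalues, columns are repetitions

mean_eigen <- rowMeans(mat)  # average each eigenvalue position over the 50 runs
length(mean_eigen)           # 20
```

Because each run draws new random samples, calling set.seed once before replicate makes the whole set of 50 repetitions reproducible.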