Search code examples
rfunctionsimulationeigenvalue

Simulation of the sampling distribution of the largest eigen value


I want to find the largest eigenvalue of X (following a Wishart Distribution). And I use the simulation to see the empirical distribution of those eigenvalues. But when I code it this way

library(MASS)

 function(X){
    maxeigen.XtX <- NULL
    num_samples <- 1000
    for(i in 1:num_samples){
      X <- mvrnorm(n=10,mu=rep(0,3),Sigma = matrix(c(1,0.2,0.1,0.2,1,0.2,0.1,0.2,1),nrow=3))
      XtX <- t(X)%*%X
      maxeigen.XtX[i] <- max(eigen(XtX)$values)
      }
      return(maxeigen.XtX)
      summary <- summary(maxeigen.XtX)
      histgram <- hist(maxeigen.XtX,breaks=100)
    }

It doesn't give my anything. No sure where the problem is?


Solution

  • Let A be your target covariance matrix:

    A <- matrix(c(1,0.2,0.1,0.2,1,0.2,0.1,0.2,1), nrow = 3)
    
    #     [,1] [,2] [,3]
    #[1,]  1.0  0.2  0.1
    #[2,]  0.2  1.0  0.2
    #[3,]  0.1  0.2  1.0
    

    Here is a working function for getting N samples of the largest eigen value. It is much more efficient than using MASS::mvrnorm, as matrix factorization of A is only done once rather than N times.

    g <- function (N, n, A) {
      ## get upper triangular Choleksy factor of covariance `A`
      R <- chol.default(A)
      ## a function to generate `n` samples from `N(0, A)`
      ## and get largest eigen value
      f <- function (n, R) {
        Xstd <- matrix(rnorm(n * dim(R)[1L]), n)  ## `n` standard normal samples
        X <- Xstd %*% R  ## transform to have covariance `A`
        S <- crossprod(X)  ## `X'X`
        max(eigen(S, symmetric = TRUE)$values)  ## symmetric eigen decomposition
        }
      ## replicate `N` times for `N` samples of largest eigen values
      replicate(N, f(n, R))
      }
    
    ## try `N = 1000`, `n = 10`, as in your original code
    set.seed(0); x <- g(1000, 10, A)
    

    Note, I don't ask g to do summary and plot. Because as long as we have samples we can do it any time.

    d <- density.default(x)  ## density estimation
    h <- hist.default(x, plot = FALSE)  ## histogram
    graphics:::plot.histogram(h, freq = FALSE, ylim = c(0, max(h$density, d$y)),
                              main = "histogram of largest Eigen value")
    lines(d$x, d$y, col = 2)
    

    enter image description here