I am working with financial/economic data in case you are wondering about the large size of some of the coefficients below... My general question has to do with the simulation of parameter coefficients output from a linear random effects model in R. I am attempting to generate a random sample of beta coefficients using the model coefficients and the variance-covariance (VCOV) matrix from the same model in R. My question is: Why am I receiving the error below about the square root of the expected values using the rmvnorm() function from the mvtnorm{} package? How can I deal with this warning/issue?
#Example call: lmer model with random effects by YEAR
#mlm<-lmer(DV~V1+V2+V3+V2*V3+V4+V5+V6+V7+V8+V9+V10+V11+(1|YEAR), data=dat)
#Note: 5 years (5 random effects total)
#LMER call yields the following information:
coef<-as.matrix(c(-28037800,0.8368619,2816347,8681918,-414002.6,371010.7,-26580.84,80.17909,271.417,-239.1172,3.463785,-828326))
sigma<-as.matrix(rbind(c(1834279134971.21,-415.95,-114036304870.57,-162630699769.14,-23984428143.44,-94539802675.96,
-4666823087.67,-93751.98,1735816.34,-1592542.75,3618.67,14526547722.87),
c(-415.95,0.00,41.69,94.17,-8.94,-22.11,-0.55,0.00,0.00,0.00,0.00,-7.97),
c(-114036304870.57,41.69,12186704885.94,12656728536.44,-227877587.40,-2267464778.61,
-4318868.82,8909.65,-355608.46,338303.72,-321.78,-1393244913.64),
c(-162630699769.14,94.17,12656728536.44,33599776473.37,542843422.84,4678344700.91,-27441015.29,
12106.86,-225140.89,246828.39,-593.79,-2445378925.66),
c(-23984428143.44,-8.94,-227877587.40,542843422.84,32114305557.09,-624207176.98,-23072090.09,
2051.16,51800.37,-49815.41,-163.76,2452174.23),
c(-94539802675.96,-22.11,-2267464778.61,4678344700.91,-624207176.98,603769409172.72,90275299.55,
9267.90,208538.76,-209180.69,-304.18,-7519167.05),
c(-4666823087.67,-0.55,-4318868.82,-27441015.29,-23072090.09,90275299.55,82486186.42,-100.73,
15112.56,-15119.40,-1.34,-2476672.62),
c(-93751.98,0.00,8909.65,12106.86,2051.16,9267.90,-100.73,2.54,8.73,-10.15,-0.01,-1507.62),
c(1735816.34,0.00,-355608.46,-225140.89,51800.37,208538.76,15112.56,8.73,527.85,-535.53,-0.01,21968.29),
c(-1592542.75,0.00,338303.72,246828.39,-49815.41,-209180.69,-15119.40,-10.15,-535.53,545.26,0.01,-23262.72),
c(3618.67,0.00,-321.78,-593.79,-163.76,-304.18,-1.34,-0.01,-0.01,0.01,0.01,42.90),
c(14526547722.87,-7.97,-1393244913.64,-2445378925.66,2452174.23,-7519167.05,-2476672.62,-1507.62,21968.29,
-23262.72,42.90,229188496.83)))
#Error begins here:
betas<-rmvnorm(n=1000, mean=coef, sigma=sigma)
#rmvnorm breaks, Error returned:
Warning message: In sqrt(ev$values) : NaNs produced
When I Google the following search string: "rmvnorm, "Warning message: In sqrt(ev$values) : NaNs produced," I saw that: http://www.nickfieller.staff.shef.ac.uk/sheff-only/mvatasksols6-9.pdf On Page 4 that this error indicates "negative eigen values." Although, I have no idea conceptually or practically what a negative eigen value is or why that they would be produced in this instance.
The second search result: [http://www.r-tutor.com/r-introduction/basic-data-types/complex2 Indicates that this error arises because of an attempt to take the square root of -1, which is "not a complex value" (you cannot take the square root of -1).
The question remains, what is going on here with the random generation of the betas, and how can this be corrected?
sessionInfo() R version 3.0.2 (2013-09-25) Platform: x86_64-apple-darwin10.8.0 (64-bit)
Using the following packages/versions mvtnorm_0.9-9994, lme4_1.1-5, Rcpp_0.10.3, Matrix_1.1-2-2, lattice_0.20-23
You have a huge range of scales in your eigenvalues:
range(eigen(sigma)$values)
## [1] -1.005407e-05 1.863477e+12
I prefer to use mvrnorm
from the MASS package, just because it comes installed automatically with R. It also appears to be more robust:
set.seed(1001)
m <- MASS::mvrnorm(n=1000, mu=coef, Sigma=sigma) ## works fine
edit: OP points out that using method="svd"
with rmvnorm
also works.
If you print the code for MASS::mvrnorm
, or debug(MASS:mvrnorm)
and step through it, you see that it uses
if (!all(ev >= -tol * abs(ev[1L]))) stop("'Sigma' is not positive definite")
(where ev
is the vector of eigenvalues, in decreasing order, so ev[1]
is the largest eigenvalue) to decide on the positive definiteness of the variance-covariance matrix. In this case ev[1L]
is about 2e12, tol
is 1e-6, so this would allow negative eigenvalues up to a magnitude of about 2e6. In this case the minimum eigenvalue is -1e-5, well within tolerance.
Farther down MASS::mvrnorm
uses pmax(ev,0)
-- that is, if it has decided that the eigenvalues are not below tolerance (i.e. it didn't fail the test above), it just truncates the negative values to zero, which should be fine for practical purposes.
If you insisted on using rmvnorm
you could use Matrix::nearPD
, which tries to force the matrix to be positive definite -- it returns a list which contains (among other things) the eigenvalues and the "positive-definite-ified" matrix:
m <- Matrix::nearPD(sigma)
range(m$eigenvalues)
## [1] 1.863477e+04 1.863477e+12
The eigenvalues computed from the matrix are not quite identical -- nearPD
and eigen
use slightly different algorithms -- but they're very close.
range(eigen(m$mat)$values)
## [1] 1.861280e+04 1.863477e+12
More generally,
dput(vcov(fit))
or save(vcov(fit))
to save the variance-covariance matrix at full precision is safer.