I am trying to get the Hessian matrix from my own data, and I have two results -
The result from the Hessian is very very small relative to the result from the numericHessian.
In this case, which results should I trust?
Specifically, the data I used ranged from 350000 to 1100000 and they were 9X2 matrix with a total of 18 data values.
I used with a sort of standard deviation formula and the result from "numericHessian" was ranging from 230 to 466 with 2X2 matrix, whereas the result from "Hessian" ranged from -3.42e-18 to 1.34e-17 which was much less than the previous one.
Which one do you think is correct calculation for the sort of standard deviation?
The code is as follows:
data=read.table("C:/file.txt", header=T);
data <- as.matrix(data);
library(plyr)
library(MASS)
w1 = tail(data/(rowSums(data)),1)
w2 = t(w1)
f <- function(x){
w1 = tail(x/(rowSums(x)),1)
w2 = t(w1)
r = ((w1%*%cov(cbind(x))%*%w2)^(1/2))
return(r)
}
library(maxLik);
numericHessian(f, t0=rbind(data[1,1], data[1,2]))
library(numDeriv);
hessian(f, rbind(data[1,1], data[1,2]), method="Richardson")
The file.txt is the following:
1 2
137 201
122 342
142 111
171 126
134 123
823 876
634 135
541 214
423 142
The result from the "numericHessian" is:
[,1] [,2]
[1,] 0.007105427 0.007105427
[2,] 0.007105427 0.000000000
Then, the result from the "Hessian" is:
[,1] [,2]
[1,] -3.217880e-15 -1.957243e-16
[2,] -1.957243e-16 1.334057e-16
Thank you very much in advance.
You have not given a reproducible example, but I'll try anyway.
library(bbmle)
x <- 0:10
y <- c(26, 17, 13, 12, 20, 5, 9, 8, 5, 4, 8)
d <- data.frame(x,y)
LL <- function(ymax=15, xhalf=6)
-sum(stats::dpois(y, lambda=ymax/(1+x/xhalf), log=TRUE))
fit <- mle2(LL)
cc <- coef(fit)
Here are the finite-difference estimates of the Hessians (matrices of second derivatives) of the negative log-likelihood function at the MLE: inverting these matrices gives an estimate of the variance-covariance matrices of the parameters.
library(numDeriv)
hessian(LL,cc)
## [,1] [,2]
## [1,] 1.296717e-01 -1.185789e-15
## [2,] -1.185789e-15 4.922087e+00
library(maxLik)
numericHessian(LL, t0=cc)
## [,1] [,2]
## [1,] 0.1278977 0.000000
## [2,] 0.0000000 4.916956
So for this relatively trivial example, numDeriv::hessian
and maxLik::numericHessian
give very similar results. So there must be something you haven't shown us, or something special about the numerics of your problem. In order to proceed further, we need a reproducible example please ...
dat <- matrix(c(137,201,122,342,142,111,
171,126,134,123,823,876,
634,135,541,214,423,142),
byrow=TRUE,ncol=2)
f <- function(x){
w1 <- tail(x/(rowSums(x)),1)
sqrt(w1%*%cov(cbind(x))%*%t(w1))
}
p <- t(dat[1,1:2,drop=FALSE])
f(p) ## 45.25483
numDeriv::hessian(f,p)
## [,1] [,2]
## [1,] -3.217880e-15 -1.957243e-16
## [2,] -1.957243e-16 1.334057e-16
maxLik::numericHessian(f,t0=p)
## [,1] [,2]
## [1,] 0.007105427 0.007105427
## [2,] 0.007105427 0.000000000
OK, these clearly disagree. I'm not sure why, but in this particular case we can analyze what you're doing and see which one is right:
x/rowSums(x)
is a vector of ones, so the last element (w1 <- tail(...,1)
) is just 1.sqrt(cov(cbind(x)))
. Again, since x
is a one-column matrix, cov()
is just the variance, and sqrt(cov(.))
is just the standard deviation, or the norm of the vector.numDeriv::hessian
is giving the right answerWe can also confirm this by increasing eps
for numericHessian
:
maxLik::numericHessian(f,t0=p,eps=1e-3)
## [,1] [,2]
## [1,] 0 0.000000e+00
## [2,] 0 -7.105427e-09
The bottom line is that numDeriv
uses a more accurate (but slower) method, but you can get reasonable answers from numericHessian
if you're careful.