I want to use the mle function to get estimates of a and b in a Unif(a, b) distribution. But I get absurd estimates, nowhere close to 1 and 3.
library(stats4)
set.seed(20161208)
N <- 100
c <- runif(N, 1, 3)
LL <- function(min, max) {
  R <- runif(100, min, max)
  suppressWarnings(-sum(log(R)))
}
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS",
    lower = c(-Inf, 0), upper = c(Inf, Inf))
I got:
Call:
mle(minuslogl = LL, start = list(min = 1, max = 3), method = "BFGS")
Coefficients:
     min      max
150.8114 503.6586
Any ideas of what's going on? Thank you in advance!
I would first point out where your code is wrong.
You need dunif, not runif. You may define:
LL <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
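As a quick sanity check (a sketch; it assumes the data live in a vector x, simulated as in the code further below), this LL reduces to length(x) * log(b - a) whenever the interval [a, b] covers all of the data:
x <- runif(100, 1, 3)
LL <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
LL(1, 3)           ## 100 * log(3 - 1), about 69.31
100 * log(3 - 1)   ## same value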
In my code below I did not use dunif, as the density is just 1 / (b - a), so I wrote it directly.
You are also generating a fresh random sample inside the objective function on every call. For U[a, b] this is OK, as its density is free of x; but for other distributions the objective function changes at each iteration (see the sketch below).
You also need method = "L-BFGS-B", not the ordinary "BFGS". And you are not using the right constraints.
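Here is the sketch mentioned above (the helper names LL_runif and LL_dunif are just for illustration): an objective that calls runif returns a different value every time it is evaluated at the same (min, max), whereas a density-based objective is deterministic.
set.seed(1)
LL_runif <- function (min, max) -sum(log(runif(100, min, max)))
LL_runif(1, 3); LL_runif(1, 3)   ## two different values for the same (min, max)
x <- runif(100, 1, 3)
LL_dunif <- function (a, b) -sum(dunif(x, a, b, log = TRUE))
LL_dunif(1, 3); LL_dunif(1, 3)   ## identical: 100 * log(2) both times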
Now in more depth...
For a length-n sample vector x from U[a, b], the likelihood is (b - a) ^ (-n) as long as a <= min(x) and b >= max(x) (and 0 otherwise), so the negative log-likelihood is n * log(b - a). Obviously the MLEs are a = min(x) and b = max(x).
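A small numeric illustration (a sketch; the helper nll_val and the example intervals are just for illustration, with x simulated as in the code further below): n * log(b - a) strictly decreases as the interval shrinks toward the data range, so the constrained minimum sits at a = min(x) and b = max(x).
x <- runif(100, 1, 3)
nll_val <- function (a, b) length(x) * log(b - a)  ## valid only while a <= min(x) and b >= max(x)
nll_val(0, 4)              ## widest interval: largest value
nll_val(0.5, 3.5)          ## tighter interval: smaller value
nll_val(min(x), max(x))    ## tightest valid interval: the minimum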
Numerical optimization is completely unnecessary, and is in fact impossible without constraints. Look at the gradient vector:
( n / (a - b), n / (b - a) )
The partial derivative w.r.t. a is always negative and the one w.r.t. b is always positive; neither can be 0.
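To see this numerically, here is a quick finite-difference check (a sketch; n, the evaluation point (a, b) = (1, 3), and the step h are arbitrary choices): both components match n / (a - b) and n / (b - a) and keep their signs.
n <- 100; a <- 1; b <- 3; h <- 1e-6
nll_ab <- function (a, b) n * log(b - a)
(nll_ab(a + h, b) - nll_ab(a, b)) / h   ## about n / (a - b) = -50, always negative
(nll_ab(a, b + h) - nll_ab(a, b)) / h   ## about n / (b - a) =  50, always positive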
The numerical approach becomes feasible when we impose box constraints: -Inf < a <= min(x) and max(x) <= b < Inf. We know for sure that the iteration terminates at the boundary.
My code below uses both optim and mle. Note that mle will fail when it inverts the Hessian matrix, as it is singular (a quick cross-check follows the code):
-1 / (b - a) ^ 2     1 / (b - a) ^ 2
 1 / (b - a) ^ 2    -1 / (b - a) ^ 2
Code:
## 100 samples
set.seed(20161208); x <- runif(100, 1, 3)
# range(x)
# [1] 1.026776 2.984544
## using `optim`
nll <- function (par) log(par[2] - par[1]) ## objective function
gr_nll <- function (par) c(-1, 1) / diff(par) ## gradient function
optim(par = c(0, 4), fn = nll, gr = gr_nll, method = "L-BFGS-B",
      lower = c(-Inf, max(x)), upper = c(min(x), Inf), hessian = TRUE)
#$par
#[1] 1.026776 2.984544 ## <- reaches boundary!
#
# ...
#
#$hessian ## <- indeed singular!!
# [,1] [,2]
#[1,] -0.2609022 0.2609022
#[2,] 0.2609022 -0.2609022
## using `stats4::mle`
library(stats4)
nll. <- function (a, b) log(b - a)
mle(minuslogl = nll., start = list(a = 0, b = 4), method = "L-BFGS-B",
    lower = c(-Inf, max(x)), upper = c(min(x), Inf))
#Error in solve.default(oout$hessian) :
# Lapack routine dgesv: system is exactly singular: U[2,2] = 0
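As a quick cross-check of the singular Hessian (a sketch reusing the x simulated above): the magnitude of every entry is just 1 / (b - a) ^ 2 evaluated at the boundary solution a = min(x), b = max(x).
1 / diff(range(x)) ^ 2
## about 0.2609, matching the absolute value of the $hessian entries from `optim` above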