Search code examples
rglmpoissonmle

hand-rolled R code for Poisson MLE


I'm attempting to write my own function to understand how the Poisson distribution behaves within a Maximum Likelihood Estimation framework (as it applies to GLM).

I'm familiar with R's handy glm function, but wanted to try and hand-roll some code to understand what's going on:

n <- 10000 # sample size
b0 <- 1.0 # intercept
b1 <- 0.2 # coefficient
x <- runif(n=n, min=0, max=1.5) # generate covariate values
lp <- b0+b1*x # linear predictor
lambda <- exp(lp) # compute lamda
y <- rpois(n=n, lambda=lambda) # generate y-values
dta <- data.frame(y=y, x=x) # generate dataset
negloglike <- function(lambda) {n*lambda-sum(x)*log(lambda) + sum(log(factorial(y)))} # build negative log-likelihood
starting.vals <- c(0,0) # one starting value for each parameter
pars <- c(b0, b1)
maxLike <- optim(par=pars,fn=negloglike, data = dta) # optimize

My R output when I enter maxLike is the following:

Error in fn(par, ...) : unused argument (data = list(y = c(2, 4....

I assume I've specified optim within my function incorrectly, but I'm not familiar enough with the nuts-and-bolts of MLE or constrained optimization to understand what I'm missing.


Solution

  • optim can only use your function in a certain way. It assumes the first parameter in your function takes in the parameters as a vector. If you need to pass other information to this function (in your case the data) you need to have that as a parameter of your function. Your negloglike function doesn't have a data parameter and that's what it is complaining about. The way you have it coded you don't need one so you probably could fix your problem by just removing the data=dat part of your call to optim but I didn't test that. Here is a small example of doing a simple MLE for just a poisson (not the glm)

    negloglike_pois <- function(par, data){
      x <- data$x
      lambda <- par[1]
    
      -sum(dpois(x, lambda, log = TRUE))
    }
    
    dat <- data.frame(x = rpois(30, 5))
    optim(par = 4, fn = negloglike_pois, data = dat)
    mean(dat$x)
    
    > optim(par = 4, fn = negloglike_pois, data = dat)
    $par
    [1] 4.833594
    
    $value
    [1] 65.7394
    
    $counts
    function gradient 
          22       NA 
    
    $convergence
    [1] 0
    
    $message
    NULL
    
    Warning message:
    In optim(par = 4, fn = negloglike_pois, data = dat) :
      one-dimensional optimization by Nelder-Mead is unreliable:
    use "Brent" or optimize() directly
    > # The "true" MLE. We didn't hit it exactly but came really close
    > mean(dat$x)
    [1] 4.833333