Search code examples
rmathoptimizationhessianmle

R optim(){fExtremes} gets 0 hessian matrix


I am using R {fExtremes} to find best parameters of GEV distribution for my data (a vector). but get the following error message

Error in solve.default(fit$hessian) : Lapack routine dgesv: system is exactly singular: U[1,1] = 0

I traced back to fit$hessian, found my hessian matrix is a sigular matrix, all of the elements are 0s. The source code (https://github.com/cran/fExtremes/blob/master/R/GevFit.R) of gevFit() shows fit$hessian is calculated by optim(). The output parameters are exactly the same value as the initial parameters. I am wondering what could be the problems of my data that cause this problem? I copied my code here

> min(sample);
[1] 5.240909

> max(sample)
[1] 175.8677

> length(sample)
[1] 6789

> mean(sample)
[1] 78.04107

>para<-gevFit(sample, type = "mle")
Error in solve.default(fit$hessian) : 
  Lapack routine dgesv: system is exactly singular: U[1,1] = 0

fit = optim(theta, .gumLLH, hessian = TRUE, ..., tmp = data)
> fit

   $par

xi   -0.3129225
mu   72.5542497 
beta  16.4450897 

$value
[1] 1e+06

$counts
function gradient 
       4       NA 

$convergence
[1] 0

$message
NULL



$hessian

     xi  mu beta

xi    0    0     0

mu    0    0      0

beta  0     0      0

I updated my dataset on google docs: https://docs.google.com/spreadsheets/d/1IRRpjmdrrJPhNmfiLism_P0efV_Ot4HlEsa6kwMnljc/edit?usp=sharing


Solution

  • This is going to be a long story, possibly more suited to https://stats.stackexchange.com/.

    ====== Part 1 -- The problem ======

    This is the sequence generating the error:

    library(fExtremes)
    samp <- read.csv("optimdata.csv")[ ,2]
    ## does not converge
    para <- gevFit(samp, type = "mle")
    

    We are facing the typical cause of lack-of-convergence when using optim() and friends: inadequate starting values for the optimisation.

    To see what goes wrong, let us use the PWM estimator (http://arxiv.org/abs/1310.3222); this consists of an analytical formula, hence it does not incur into convergence problems, since it makes no use of optim():

    para <- gevFit(samp, type = "pwm")
    fitpwm<- attr(para, "fit")
    fitpwm$par.ests
    

    The estimated tail parameter xi is negative, corresponding to a bounded upper tail; in fact the fitted distribution displays even more "upper tail boundedness" than the sample data, as you can see from the "leveling off" of the quantile-quantile graph at the right:

    qqgevplot <- function(samp, params){
      probs <- seq(0.1,0.99,by=0.01)
      qqempir <- quantile(samp, probs)
      qqtheor <-  qgev(probs, xi=params["xi"], mu=params["mu"], beta=params["beta"])
      rang <- range(qqempir,qqtheor)
      plot(qqempir, qqtheor, xlim=rang, ylim=rang,
         xlab="empirical", ylab="theoretical",
         main="Quantile-quantile plot")
      abline(a=0,b=1, col=2)
    }
    qqgevplot(samp, fitpwm$par.ests)
    

    For xi<0.5 the MLE estimator is not regular (http://arxiv.org/abs/1301.5611): the value of -0.46 estimated by PWM for xi is very close to that. Now the PWM estimates are used internally by gevFit() as starting values for optim(): you can see this if you print out the code for the function gevFit():

    print(gevFit)
    print(.gevFit)
    print(.gevmleFit)
    

    The starting value for optim is theta, obtained by PWM. For the specific data at hand, this starting value is not adequate, in that it leads to non-convergence of optim().

    ====== Part 2 -- solutions? ======

    Solution 1 is to use para <- gevFit(samp, type = "pwm") as above. If you'd like to use ML, then you have to specify good starting values for optim(). Unfortunately, the fExtremes package does not make it easy to do so. You can then re-define your own version of .gevmleFit to include those, e.g.

    .gevmleFit <- function (data, block = NA, start.param, ...) 
    {
      data = as.numeric(data)
      n = length(data)
      if(missing(start.param)){
        theta = .gevpwmFit(data)$par.ests
      }else{
        theta = start.param
      }
      fit = optim(theta, .gevLLH, hessian = TRUE, ..., tmp = data)
      if (fit$convergence) 
        warning("optimization may not have succeeded")
      par.ests = fit$par
      varcov = solve(fit$hessian)
      par.ses = sqrt(diag(varcov))
      ans = list(n = n, data = data, par.ests = par.ests, par.ses = par.ses, 
                 varcov = varcov, converged = fit$convergence, nllh.final = fit$value)
      class(ans) = "gev"
      ans
    }
    ## diverges, just as above
    .gevmleFit(samp)
    ## diverges, just as above
    startp <- fitpwm$par.ests
    .gevmleFit(samp, start.param=startp)
    ## converges
    startp <- structure(c(-0.1, 1, 1), names=names(fitpwm$par.ests))
    .gevmleFit(samp, start.param=startp)$par.ests
    

    Now check this out: the beta estimated by PWM is 0.1245; by changing this to a tiny amount, the MLE is made to converge:

    startp <- fitpwm$par.ests
    startp["beta"]
    startp["beta"] <- 0.13
    .gevmleFit(samp, start.param=startp)$par.ests
    

    This hopefully clearly illustrates that to blindly optim()ise works until it doesn't and might then turn into a quite delicate endeavour indeed. For this reason, it might be useful to leave this reply here, rather than to migrate to CrossValidated.