Search code examples
rmaxanalysissurvivallog-likelihood

Is there a way so that an already defined argument can appear as missing when using optim() function in R?


I am trying to get the Maximum Likelihood Estimators of the log-likelihood of a Gumbel distribution for survival analysis(i say that so that you dont get astranged by the log-likelihood function, i think its correct). In order to do that i have to maximize the minus log-likelihood by using the optim function, i tried to do so but the console is giving me an Error in fn(par, ...) : argument "b" is missing, with no default.

I tried also to do it in a similar way as in the answer to this link: Solve for maximum likelihood with two parameters under constraints ,but the console game me the following: Error in optim(c(1, 1), function(x) log_lhood(x[1], x[2], d = lung$status, : objective function in optim evaluates to length 0 not 1.

log_lhood <- function(m,b,d,t){                           
  sum<-0
  for (i in 1:length(lung)){ 
    if (d[i] == 1){
      sum<- sum - log(1-exp(-exp(-(t[i]-m)/b)))
    } else {
      sum<- sum - log((1/b)*exp(-(t[i]-m/b+exp(-(t[i]-m/b)))))
    }
  }
}
#a,b parameter optimization

optim(c(0,0), fn = log_lhood, d = lung$status, t = KM_fit$time) #1st way

optim(c(1, 1), function(x) log_lhood(x[1], x[2],d=lung$status,t=KM_fit_test$time)) #2nd way as in the link


Solution

  • There are a few problems here. The first argument of the function is supposed to be a vector of parameters. Also you need nrow(lung) not length(lung), and it would be better to use length(d). Also you should not use a loop here, it's very inefficient, use ifelse() (in R we always try to vectorise everything). Also you need to check that the log likelihood can be calculated for all values of the parameters (e.g. b=0). Also you forgot to return(sum). Also sum is a useful function you shouldn't mask.

    This runs.

    library(reprex)
    
    lung <- data.frame(status=c(0,0,1,1))
    KM_fit <- data.frame(time=c(0,1,2,3))
    
    log_lhood <- function(x,d,t){
        m <- x[1]
        b <- x[2]
        sum <-0
        for (i in 1:nrow(lung)){
            if (d[i] == 1){
                sum <- sum - log(1-exp(-exp(-(t[i]-m)/b)))
            } else {
                sum <- sum - log((1/b)*exp(-(t[i]-m/b+exp(-(t[i]-m/b)))))
            }
        }
        return(sum)
    }
    #a,b parameter optimization
    
    optim(par=c(0,1), fn = log_lhood, d = lung$status, t = KM_fit$time)
    $par
    [1] 1.661373 1.811780
    
    $value
    [1] 5.318068
    
    $counts
    function gradient 
          63       NA 
    
    $convergence
    [1] 0
    
    $message
    NULL
    

    You could rewrite your function like this.

    log_lhood <- function(x,d,t){
        m <- x[1]
        b <- x[2]
        s <- ifelse(d==1,
                      -log(1-exp(-exp(-(t-m)/b))),
                      -log((1/b)*exp(-(t-m/b+exp(-(t-m/b)))))
                      )
        return(sum(s, na.rm=TRUE))
    }