Tags: r, optimization, regression, logistic-regression, glm

Estimate a probit regression model with optim()


I need to program a probit regression model manually, without using glm. I want to use optim() for direct minimization of the negative log-likelihood.

I wrote the code below, but it does not work, giving the error:

cannot coerce type 'closure' to vector of type 'double'

# load data (provided via the link at the bottom); `read.dta` is in package foreign
library(foreign)
Datospregunta2a <- read.dta("problema2_1.dta")
attach(Datospregunta2a)

# model matrix `X` and response `Y`
X <- cbind(1, associate_professor, full_professor, emeritus_professor, other_rank)
Y <- volunteer

# number of regression coefficients
K <- ncol(X)

# initial guess on coefficients
vi <- lm(volunteer ~ associate_professor, full_professor, emeritus_professor, other_rank)$coefficients

# negative log-likelihood
probit.nll <- function (beta) {
  exb <- exp(X%*%beta)
  prob<- rnorm(exb)
  logexb <- log(prob)
  y0 <- (1-y)
  logexb0 <- log(1-prob)
  yt <- t(y)
  y0t <- t(y0)
  -sum(yt%*%logexb + y0t%*%logexb0)
  }

# gradient
probit.gr <- function (beta) {
  grad <- numeric(K)
  exb <- exp(X%*%beta)
  prob <- rnorm(exb)
  for (k in 1:K) grad[k] <- sum(X[,k]*(y - prob))
  return(-grad)
  }

# direct minimization
fit <- optim(vi, probit.nll, gr = probit.gr, method = "BFGS", hessian =  TRUE)

data: https://drive.google.com/file/d/0B06Id6VJyeb5OTFjbHVHUE42THc/view?usp=sharing


Solution

  • case sensitivity

    R is case sensitive: Y and y are different objects. Use Y, not y, inside your functions probit.nll and probit.gr.
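
    A minimal (throwaway) illustration of the resulting lookup failure:

    Y <- c(0, 1, 1)
    y
    # Error: object 'y' not found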

    These two functions also contain a more fundamental bug: rnorm draws random samples from the normal distribution, whereas the probit link needs pnorm, the standard normal CDF, to map the linear predictor to a probability (the exp is also unnecessary).
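
    To see the difference between the two (values shown for set.seed(1)):

    set.seed(1)
    rnorm(3)             # random draws: different on every (unseeded) run
    # -0.6264538  0.1836433 -0.8356286
    pnorm(c(-1, 0, 1))   # deterministic CDF values P(Z <= z)
    # 0.1586553 0.5000000 0.8413447

    The following are corrected versions.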

    negative log-likelihood function

    # requires model matrix `X` and binary response `Y`
    probit.nll <- function (beta) {
      # linear predictor
      eta <- X %*% beta
      # probability
      p <- pnorm(eta)
      # negative log-likelihood
      -sum((1 - Y) * log(1 - p) + Y * log(p))
      }
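
    As a side note, when |eta| is large, pnorm(eta) can round to exactly 0 or 1, so log(p) or log(1 - p) returns -Inf. A sketch of a numerically safer variant, using pnorm's log.p argument (same X and Y assumed):

    probit.nll.stable <- function (beta) {
      eta <- X %*% beta
      # log P(Y = 1) and log P(Y = 0) computed directly on the log scale
      -sum(Y * pnorm(eta, log.p = TRUE) + (1 - Y) * pnorm(-eta, log.p = TRUE))
      }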
    

    gradient function

    # requires model matrix `X` and binary response `Y`
    probit.gr <- function (beta) {
      # linear predictor
      eta <- X %*% beta
      # probability
      p <- pnorm(eta)
      # chain rule
      u <- dnorm(eta) * (Y - p) / (p * (1 - p))
      # gradient
      -crossprod(X, u)
      }
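
    A quick finite-difference check of the analytic gradient (a sanity test; assumes X, Y, and the two functions above are in the workspace):

    K <- ncol(X)
    beta0 <- rep(0, K)
    eps <- 1e-6
    fd <- sapply(1:K, function (k) {
      e <- replace(numeric(K), k, eps)
      (probit.nll(beta0 + e) - probit.nll(beta0 - e)) / (2 * eps)
      })
    # largest absolute discrepancy; should be near zero (~1e-8 or smaller)
    max(abs(fd - probit.gr(beta0)))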
    

    initial parameter values from lm()

    This is not a sound idea: linear regression should not be applied to binary data.

    However, purely focusing on the use of lm, note that you need + rather than , to separate covariates on the right-hand side of the formula. As written, the comma makes R pass full_professor as lm's data argument and the remaining names as further positional arguments.
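
    With that fixed, the call from the question would read:

    vi <- lm(volunteer ~ associate_professor + full_professor +
             emeritus_professor + other_rank)$coefficients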


    reproducible example

    Let's generate a toy dataset:

    set.seed(0)
    # model matrix
    X <- cbind(1, matrix(runif(300, -2, 1), 100))
    # coefficients
    b <- runif(4) 
    # response
    Y <- rbinom(100, 1, pnorm(X %*% b))
    
    # `glm` estimate
    GLM <- glm(Y ~ X - 1, family = binomial(link = "probit"))
    
    # our own estimation via `optim`
    # I am using `b` as initial parameter values (being lazy)
    fit <- optim(b, probit.nll, gr = probit.gr, method = "BFGS", hessian = TRUE)
    
    # comparison
    unname(coef(GLM))
    # 0.62183195  0.38971121  0.06321124  0.44199523
    
    fit$par
    # 0.62183540  0.38971287  0.06321318  0.44199659
    

    They are very close to each other!
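
    Since hessian = TRUE was requested, approximate standard errors also come for free, from the inverse of the Hessian of the negative log-likelihood at the optimum; they should likewise agree closely with glm's:

    # standard errors from the observed information
    se <- sqrt(diag(solve(fit$hessian)))
    # compare with the standard errors reported by `glm`
    unname(summary(GLM)$coefficients[, "Std. Error"])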