Tags: r, logistic-regression, glm, statistics-bootstrap

Bootstrapping a logistic model - some subsets do not converge


I want to bootstrap a logistic regression model. The model fitted to the whole dataset converges fine. However, the boot function draws resamples for which the fit no longer converges. What can I do?

library(boot)
set.seed(2)
y <-  c(rep(0,10),rep(1,10))
x <- c(rnorm(10,2,1),rnorm(10,6,1))
dat <- data.frame(x, y)

fit <- glm(y ~ x, quasibinomial(), data=dat)           # Model with all data works fine

bs <- function(data, indices) {
  d <- data[indices,]
  fitboot <- glm(y ~ x, family = quasibinomial(), data=d)
  return(coef(fitboot)) 
} 

results <- boot(data=dat, statistic=bs, R=10)          # I get warnings

I get warnings that say:

1: glm.fit: algorithm did not converge 
2: glm.fit: algorithm did not converge 
3: glm.fit: algorithm did not converge 
4: glm.fit: algorithm did not converge 

This seems to be due to the subsets chosen.

Interestingly this subset works:

 fit <- glm(y ~ x, quasibinomial(), data=dat[1:13,]) 

But this does not:

 fit <- glm(y ~ x, quasibinomial(), data=dat[1:14,]) 

Why is that? And what can I do to bootstrap this model?


Solution

  • The iteratively reweighted least squares (IRLS) algorithm that glm uses did not converge within the default limit of 25 iterations (the default maxit in glm.control). The warning goes away if you raise the iteration limit as shown below:

    fitboot <- glm(y ~ x, family = quasibinomial(), data=d,
                   control=glm.control(maxit=50))
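
    For completeness, here is a minimal sketch of the whole bootstrap with the higher iteration limit, reusing the data and names from the question. The check on fitboot$converged and the NA return value are additions for illustration, not part of the original answer; fit14 is a hypothetical name used only to inspect the subset that failed in the question. The two groups of x are far apart here, so some resamples may be perfectly separated; in that case a fit can report convergence while its coefficients are very large, so it is worth looking at results$t rather than relying on the absence of warnings alone.

```r
library(boot)

set.seed(2)
y <- c(rep(0, 10), rep(1, 10))
x <- c(rnorm(10, 2, 1), rnorm(10, 6, 1))
dat <- data.frame(x, y)

# Statistic for boot(): refit on the resampled rows with a higher
# iteration limit. As an added safeguard (an assumption, not part of
# the original answer), return NA for any fit that glm still flags
# as not converged, so it can be excluded later.
bs <- function(data, indices) {
  d <- data[indices, ]
  fitboot <- glm(y ~ x, family = quasibinomial(), data = d,
                 control = glm.control(maxit = 50))
  if (!fitboot$converged) {
    return(c(NA_real_, NA_real_))
  }
  coef(fitboot)
}

results <- boot(data = dat, statistic = bs, R = 10)

# One row per bootstrap replicate, one column per coefficient.
results$t

# Number of replicates that were dropped, per coefficient.
colSums(is.na(results$t))

# Inspect the subset from the question that failed with the defaults.
fit14 <- glm(y ~ x, quasibinomial(), data = dat[1:14, ],
             control = glm.control(maxit = 50))
fit14$converged   # did IRLS stop within the limit?
fit14$iter        # number of iterations actually used
```

    If fit14$iter comes out only slightly above 25, that would be consistent with the explanation above: the default limit was just too low for that subset.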