r variables model regression logistic-regression

Model selection returns an intercept-only model


I am fitting a logistic regression of CHD status (chd) on several predictors. The data frame looks like this:

  ind sbp tobacco  ldl adiposity typea obesity alcohol age chd
1   1 160   12.00 5.73     23.11    49   25.30   97.20  52   1
2   2 144    0.01 4.41     28.61    55   28.87    2.06  63   1
...

I ran backward stepwise selection on this model to find the best model, but the result is a model that contains only the intercept. Why does this happen, and what does it mean?

#full model with all predictors
model <- glm(chd ~ ., data = CHD, family = binomial(link = "logit"))
#intercept-only (null) model
intercept_only <- glm(chd ~ 1, data = CHD, family = binomial(link = "logit"))

#perform backward stepwise regression
back <- step(intercept_only, direction='backward', scope=formula(model), trace=0)

#view results of backward stepwise regression
back$anova
  Step Df Deviance Resid. Df Resid. Dev      AIC
1      NA       NA       461   596.1084 598.1084

Solution

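  • To do backward selection, start from a model that contains the variables rather than from the intercept-only model. With direction='backward', step() only considers dropping terms, and intercept_only has no terms to drop, so it is returned unchanged. Start from the full model instead: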

    back <- step(model, direction='backward', scope=formula(model), trace=0)

    Use intercept_only as the starting model only with direction='forward' or direction='both', where scope defines the terms that may be added.
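
    As a rough illustration of the difference, here is a minimal sketch on simulated data. The data frame sim, its columns, and the coefficients used to generate chd are made up for the example and are not the original CHD data. Which terms end up selected depends on the simulated values, so no particular result is claimed beyond the last line.

```r
# Simulated stand-in for the CHD data (hypothetical values, not the original data set)
set.seed(1)
n <- 300
sim <- data.frame(
  age     = round(runif(n, 20, 65)),
  tobacco = rexp(n, rate = 0.3),
  noise   = rnorm(n)                 # unrelated to the outcome by construction
)
sim$chd <- rbinom(n, 1, plogis(-4 + 0.07 * sim$age + 0.1 * sim$tobacco))

full      <- glm(chd ~ ., data = sim, family = binomial(link = "logit"))
null_only <- glm(chd ~ 1, data = sim, family = binomial(link = "logit"))

# Backward: start from the full model; step() may only drop terms
back <- step(full, direction = "backward", trace = 0)

# Forward: start from the intercept-only model; scope gives the largest model allowed
fwd <- step(null_only, direction = "forward",
            scope = list(lower = ~ 1, upper = formula(full)), trace = 0)

# Backward from the intercept-only model: nothing to drop, so it comes back unchanged
stuck <- step(null_only, direction = "backward", scope = formula(full), trace = 0)

formula(back)
formula(fwd)
formula(stuck)   # chd ~ 1, as in the question
```

    Comparing formula(back) and formula(fwd) on your own data is a quick way to check whether both directions settle on the same model.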