r glm logistic-regression

How to interpret unusual results from glm model?


I am using a logistic regression model to predict values in a raster dataset. Data used in the model are in the following format:

class     b1     b2     b3     b4
A         121    111    90     160
A         100    90     67     90
B         90     120    102    154
...

I would expect the output of the model to be categorical (A or B; there are only two classes). Instead, the glm model yields continuous values ranging from 0 to 1. Either my interpretation of the model output is incorrect, or I am coding this wrong. How should I interpret these results?



  # GLM
  myglm = glm(factor(class) ~ b1 + b2 + b3 + b4, data = df, family = binomial(link = "logit"))

  # Predict results and write to image 
  predict(sf, myglm, outpath, type="response", 
          index=1, na.rm=TRUE, progress="text", overwrite=TRUE)

Solution

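  • The output is correct. You should interpret these values as probabilities. The baseline (reference) level of the factor determines which class the probability refers to: with a factor response, glm treats the first level as the baseline and models the probability of the other level.

    The value 0.7 means a 70% probability that the data point belongs to class B if the levels are ordered A, B (the default alphabetical order), or to class A if you make B the first level.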

    If you want binary classes as output, you have to choose a probability cut-off. If the prevalence is 50%, a cut-off of 0.5 should suffice.
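
As a rough illustration of the points above, the following sketch fits the same kind of model on made-up data, checks which factor level the probabilities refer to, and then applies a cut-off. The toy data frame, the use of only two bands, and the names `cutoff` and `pred_class` are assumptions for the example, not part of the original question.

```r
# Toy data shaped like the question's table (values are invented for illustration)
set.seed(42)
n  <- 200
df <- data.frame(
  class = rep(c("A", "B"), each = n / 2),
  b1 = c(rnorm(n / 2, 110, 15), rnorm(n / 2, 95, 15)),
  b2 = c(rnorm(n / 2, 100, 15), rnorm(n / 2, 115, 15))
)

# Set the level order explicitly: the first level is the baseline,
# so fitted values are the probability of the second level ("B" here)
df$class <- factor(df$class, levels = c("A", "B"))
levels(df$class)

myglm <- glm(class ~ b1 + b2, data = df, family = binomial(link = "logit"))

# type = "response" returns probabilities between 0 and 1, not class labels
p <- predict(myglm, newdata = df, type = "response")

# Turn probabilities into classes with a chosen cut-off (0.5 is an assumption
# that fits a roughly balanced data set)
cutoff <- 0.5
pred_class <- factor(
  ifelse(p > cutoff, levels(df$class)[2], levels(df$class)[1]),
  levels = levels(df$class)
)

# Cross-tabulate observed against predicted classes
table(observed = df$class, predicted = pred_class)
```

For a predicted probability raster, the same idea should carry over by comparing the probability layer against the cut-off (for example `prob > 0.5`, where `prob` is a hypothetical name for the layer returned by the raster prediction), which gives a 0/1 layer for the non-baseline class.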