r, na, glm

Why is one variable showing up with NA's in my quasipoisson glm?


I have fitted a quasipoisson GLM to my data and simplified the model. However, when I view the summary, the coefficients for one of my variables (Month) come up as NA. This variable is a character, but so are some of my other variables, and those haven't caused any issues.

Does anyone know why this variable might be coming up as NA and how to fix it?

Dataset

Date        DOY  Month  Species       Quantity  Flower selection
13/07/2020  195  Jul20  B Lucorum     13        Lavendula
13/07/2020  195  Jul20  B Lapidarius  1         Verbena
13/07/2020  195  Jul20  B Terrestris  3         Centaurea
13/07/2020  195  Jul20  B Pascorum    1         Vicia craccu
13/07/2020  195  Jul20  B Lapidarius  7         Phalcelia
13/07/2020  195  Jul20  B Terrestris  4         Lavendula
13/07/2020  195  Jul20  B Terrestris  9         Verbena
13/07/2020  195  Jul20  B Lapidarius  1         Phalcelia
13/07/2020  195  Jul20  B Lucorum     3         Lavendula
g1 <- glm(Quantity ~ Location + Recorder + Species + Flower.selection + Date + Month,
          family = quasipoisson(),
          data = BW)

# g2 (an intermediate simplified model) is not shown here
g3 <- update(g2, ~ . - Recorder, family = quasipoisson())

summary(g3)

A sample of the results from summary(g3):

MonthSep-20                                       NA         NA      NA       NA    
MonthMar-21                                       NA         NA      NA       NA    
MonthApr-21                                       NA         NA      NA       NA    
MonthMay-21                                       NA         NA      NA       NA    
MonthJun-21                                       NA         NA      NA       NA    
MonthJul-21                                       NA         NA      NA       NA

Solution

  • Date and Month are redundant predictors in your model: once you know the date, the month is fully determined, so adding Month contributes no new information. Redundant (perfectly collinear) terms make the design matrix rank-deficient, so separate coefficients cannot be estimated for them; R drops the aliased columns and reports NA for their coefficients.

    You could leave out one or the other (if both are factors, dropping Month won't change the fitted model at all), or use a mixed model with Month and Date as random-effect grouping variables (Date nested within Month). Depending on the package you use, family = "quasipoisson" may not be available; see the relevant section of the GLMM FAQ for alternatives.
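To see the redundancy in isolation, here is a minimal sketch on made-up data. The `toy` data frame, its dates, and the simulated counts are assumptions for illustration only, not the asker's `BW` data. It checks that every date falls in exactly one month, fits the model with both terms, and then shows that dropping Month leaves the fit unchanged.

```r
# Toy data (hypothetical): 6 sampling dates spread over 3 months
set.seed(1)
toy <- data.frame(
  Date  = rep(c("13/07/2020", "27/07/2020", "10/08/2020",
                "24/08/2020", "07/09/2020", "21/09/2020"), each = 10),
  Month = rep(c("Jul-20", "Aug-20", "Sep-20"), each = 20)
)
toy$Quantity <- rpois(nrow(toy), lambda = 4)

# Each date occurs in exactly one month, so Month is nested within Date
nesting <- table(toy$Date, toy$Month)
month_nested_in_date <- all(rowSums(nesting > 0) == 1)

# Date enters first, so the Month dummy columns are linear combinations
# of columns already in the design matrix; their coefficients are NA
both <- glm(Quantity ~ Date + Month, family = quasipoisson(), data = toy)
print(coef(both))

# Dropping Month gives the same fitted model
date_only <- glm(Quantity ~ Date, family = quasipoisson(), data = toy)
print(all.equal(deviance(both), deviance(date_only)))
```

On the real data, the same `table(BW$Date, BW$Month)` check should confirm whether each date maps to a single month. Which term ends up as NA depends on the order in the formula: the term listed later is the one that gets dropped.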