
Why is my glm still analyzing multiple variables when I use as.factor()?


I am trying to fit a glm that looks at the effects of food type, habitat, and starvation period on food preference in ants. I want to treat food type as a single factor, even though I offer the ants 5 foods. I have used as.factor() on the food variable, but the summary still reports a separate row for each food. I want a single p-value for how food affects individuals. Am I missing something?

  NumofAnts FoodType Trial SiteType
1         0     Pink     1  natural
2         4     Pink     1  natural
3         5     Pink     1  natural
4         4     Pink     1  natural
5         8     Pink     1  natural
6         5     Pink     1  natural

fit <- glm(NumofAnts ~ as.factor(FoodType) + Trial + SiteType,
           family = poisson(link = log), data = stacked1)
summary(fit)

glm(formula = NumofAnts ~ as.factor(FoodType) + Trial + SiteType, 
    family = poisson(link = log), data = stacked1)

Deviance Residuals: 
    Min       1Q   Median       3Q      Max  
-3.5644  -2.2495  -1.0023   0.8588   8.8051  

Coefficients:
                          Estimate Std. Error z value Pr(>|z|)    
(Intercept)                1.46177    0.08031  18.202  < 2e-16 ***
as.factor(FoodType)Blue   -0.66665    0.06824  -9.769  < 2e-16 ***
as.factor(FoodType)Green  -0.29987    0.06093  -4.922 8.57e-07 ***
as.factor(FoodType)Yellow -0.28086    0.06060  -4.635 3.57e-06 ***
as.factor(FoodType)Red    -0.92502    0.07459 -12.401  < 2e-16 ***
Trial                      0.19355    0.04327   4.473 7.73e-06 ***
SiteTypeurban             -0.19730    0.04328  -4.558 5.16e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Solution

  • A glm estimates one coefficient (and so one p-value) for a numeric variable. For a categorical variable (like food in your case), it estimates one coefficient for each level except one, the reference level. In your case, food has 5 levels, so 4 coefficients (and 4 p-values) are estimated. Each of those p-values compares one food against the reference level (Pink in your output); none of them tests the effect of food as a whole.
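
One common way to get a single p-value for a multi-level factor is a likelihood-ratio test that compares the model with and without that term. The sketch below illustrates this with `drop1()` and `anova()` from base R. The data frame is simulated as a stand-in for `stacked1` (the original data are not available, so the counts and effect sizes here are assumptions), and `fit_nofood`, `lrt`, and `cmp` are illustrative names.

```r
# Simulated stand-in for the question's `stacked1` (hypothetical data,
# NOT the original ant counts).
set.seed(1)
stacked1 <- expand.grid(
  FoodType = c("Pink", "Blue", "Green", "Yellow", "Red"),
  Trial    = 1:2,
  SiteType = c("natural", "urban"),
  rep      = 1:10
)
# Assumed food effect, only so the example has something to detect.
lambda <- 4 * exp(-0.2 * (as.integer(stacked1$FoodType) - 1))
stacked1$NumofAnts <- rpois(nrow(stacked1), lambda = lambda)

# Same model as in the question.
fit <- glm(NumofAnts ~ as.factor(FoodType) + Trial + SiteType,
           family = poisson(link = "log"), data = stacked1)

# Likelihood-ratio test for dropping each term in turn:
# one row, and one p-value, per variable (FoodType has Df = 4).
lrt <- drop1(fit, test = "Chisq")
print(lrt)

# The same test for food written out explicitly:
# refit without FoodType and compare the two nested models.
fit_nofood <- update(fit, . ~ . - as.factor(FoodType))
cmp <- anova(fit_nofood, fit, test = "Chisq")
print(cmp)
```

In the `drop1()` table, the `as.factor(FoodType)` row carries the overall test for food on 4 degrees of freedom. With the real data you would skip the simulation step and call `drop1(fit, test = "Chisq")` on the fitted model directly.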