Tags: r, linear-regression, glm, lm, p-value

How do I use the glm() function?


I'm trying to fit a general linear model (GLM) to my data using R. I have a continuous response variable Y and two categorical factors, A and B. Each factor is coded as 0 or 1, for absence or presence.

Even though I can see a clear interaction between A and B just by looking at the data, the GLM reports a p-value far above 0.05. Am I doing something wrong?

First, I create the data frame for the GLM, which consists of a dependent variable Y and two factors, A and B. These are two-level factors (0 and 1), with 3 replicates per combination.

A<-c(0,0,0,1,1,1,0,0,0,1,1,1)
B<-c(0,0,0,0,0,0,1,1,1,1,1,1)
Y<-c(0.90,0.87,0.93,0.85,0.98,0.96,0.56,0.58,0.59,0.02,0.03,0.04)
my_data<-data.frame(A,B,Y)

Let's see what it looks like:

my_data
##    A B    Y
## 1  0 0 0.90
## 2  0 0 0.87
## 3  0 0 0.93
## 4  1 0 0.85
## 5  1 0 0.98
## 6  1 0 0.96
## 7  0 1 0.56
## 8  0 1 0.58
## 9  0 1 0.59
## 10 1 1 0.02
## 11 1 1 0.03
## 12 1 1 0.04
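
The table can be condensed into one mean of Y per A/B combination, which makes any interaction easier to read than the raw rows. This is only a minimal sketch: the names `cell_means` and `interaction_contrast` are introduced here for illustration and are not part of the original code.

```r
# Rebuild the data so the snippet runs on its own
A <- c(0,0,0,1,1,1,0,0,0,1,1,1)
B <- c(0,0,0,0,0,0,1,1,1,1,1,1)
Y <- c(0.90,0.87,0.93,0.85,0.98,0.96,0.56,0.58,0.59,0.02,0.03,0.04)
my_data <- data.frame(A, B, Y)

# Mean of Y in each of the four A/B cells (rows: A, columns: B)
cell_means <- with(my_data, tapply(Y, list(A = A, B = B), mean))
cell_means

# Difference of differences: (effect of A when B=1) - (effect of A when B=0)
# A value far from zero suggests an interaction on the raw scale of Y
interaction_contrast <- (cell_means["1", "1"] - cell_means["0", "1"]) -
                        (cell_means["1", "0"] - cell_means["0", "0"])
interaction_contrast
```

If the effect of A were the same at both levels of B, the contrast would be close to zero.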

As we can see just by looking at the data, there is a clear interaction between factor A and factor B: the value of Y decreases dramatically when both A and B are present (that is, A=1 and B=1). However, using the glm function I get no significant interaction between A and B, with a p-value far above 0.05:

attach(my_data)
## The following objects are masked _by_ .GlobalEnv:
## 
##     A, B, Y


my_glm<-glm(Y~A+B+A*B,data=my_data,family=binomial)
## Warning: non-integer #successes in a binomial glm!
summary(my_glm)
## 
## Call:
## glm(formula = Y ~ A + B + A * B, family = binomial, data = my_data)
## 
## Deviance Residuals: 
##       Min         1Q     Median         3Q        Max  
## -0.275191  -0.040838   0.003374   0.068165   0.229196  
## 
## Coefficients:
##             Estimate Std. Error z value Pr(>|z|)
## (Intercept)   2.1972     1.9245   1.142    0.254
## A             0.3895     2.9705   0.131    0.896
## B            -1.8881     2.2515  -0.839    0.402
## A:B          -4.1747     4.6523  -0.897    0.370
## 
## (Dispersion parameter for binomial family taken to be 1)
## 
##     Null deviance: 7.86365  on 11  degrees of freedom
## Residual deviance: 0.17364  on  8  degrees of freedom
## AIC: 12.553
## 
## Number of Fisher Scoring iterations: 6

Solution

  • family=binomial implies logit (logistic) regression, which models a binary outcome, whereas Y here is continuous.

    From Quick-R

    Logistic Regression

    Logistic regression is useful when you are predicting a binary outcome from a set of continuous predictor variables. It is frequently preferred over discriminant function analysis because of its less restrictive assumptions.
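
Since Y in the question is continuous, one plausible alternative is to fit the same formula with a Gaussian family (identity link), which is equivalent to `lm(Y ~ A * B)`. The sketch below assumes that an ordinary linear model is what the asker intended. The names `my_fit`, `coef_table` and `interaction_p` are hypothetical and introduced only for illustration.

```r
# Rebuild the data from the question so the snippet runs on its own
A <- c(0,0,0,1,1,1,0,0,0,1,1,1)
B <- c(0,0,0,0,0,0,1,1,1,1,1,1)
Y <- c(0.90,0.87,0.93,0.85,0.98,0.96,0.56,0.58,0.59,0.02,0.03,0.04)
my_data <- data.frame(A, B, Y)

# A * B already expands to A + B + A:B, so the main effects need not be repeated.
# family = gaussian treats Y as continuous (assumption: a linear model is wanted)
my_fit <- glm(Y ~ A * B, data = my_data, family = gaussian)

# Coefficient table: estimates, standard errors, t values and p-values
coef_table <- summary(my_fit)$coefficients
coef_table

# p-value of the interaction term
interaction_p <- coef_table["A:B", "Pr(>|t|)"]
interaction_p
```

With this family the interaction estimate is roughly -0.58 on the scale of Y, and its p-value falls well below 0.05, in line with what the raw data suggest. Passing `data = my_data` also makes the `attach(my_data)` call unnecessary.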