R: Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family")


I tried to use glm to estimate soccer team strengths.

# data is a data frame (structure shown below).
model <- glm(Goals ~ Home + Team + Opponent, family=poisson(link=log), data=data)

but I get the error:

Error in if (any(y < 0)) stop("negative values not allowed for the 'Poisson' family") : 
  missing value where TRUE/FALSE needed
In addition: Warning message:
In Ops.factor(y, 0) : ‘<’ not meaningful for factors

data:

> data
                      Team                 Opponent Goals Home
1 5a51f2589d39c31899cce9d9 5a51f2579d39c31899cce9ce     3    1
2 5a51f2579d39c31899cce9ce 5a51f2589d39c31899cce9d9     0    0
3 5a51f2589d39c31899cce9da 5a51f2579d39c31899cce9cd     3    1
4 5a51f2579d39c31899cce9cd 5a51f2589d39c31899cce9da     0    0

> is.factor(data$Goals)
[1] TRUE

Solution

  • From the "Details" section of the documentation for the glm() function:

    A typical predictor has the form response ~ terms where response is the (numeric) response vector and terms is a series of terms which specifies a linear predictor for response.

    So you want to make sure your Goals column is numeric:

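    # stringsAsFactors = TRUE keeps Team and Opponent as factors, matching the
    # str() output below (the default changed to FALSE in R 4.0.0).
    df <- data.frame( Team= c("5a51f2589d39c31899cce9d9", "5a51f2579d39c31899cce9ce", "5a51f2589d39c31899cce9da", "5a51f2579d39c31899cce9cd"),
                      Opponent=c("5a51f2579d39c31899cce9ce", "5a51f2589d39c31899cce9d9", "5a51f2579d39c31899cce9cd", "5a51f2589d39c31899cce9da"),
                      Goals=c(3,0,3,0),
                      Home=c(1,0,1,0),
                      stringsAsFactors=TRUE)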
    
    str(df)
    #'data.frame':  4 obs. of  4 variables:
    # $ Team    : Factor w/ 4 levels "5a51f2579d39c31899cce9cd",..: 3 2 4 1
    # $ Opponent: Factor w/ 4 levels "5a51f2579d39c31899cce9cd",..: 2 3 1 4
    # $ Goals   : num  3 0 3 0
    # $ Home    : num  1 0 1 0
    
    
    model <- glm(Goals ~ Home + Team + Opponent, family=poisson(link=log), data=df)
    

    Then here is the output:

    > model
    
    
    Call:  glm(formula = Goals ~ Home + Team + Opponent, family = poisson(link = log), 
        data = df)
    
    Coefficients:
                          (Intercept)                               Home       Team5a51f2579d39c31899cce9ce  
                           -2.330e+01                          2.440e+01                         -3.089e-14  
         Team5a51f2589d39c31899cce9d9       Team5a51f2589d39c31899cce9da   Opponent5a51f2579d39c31899cce9ce  
                           -6.725e-15                                 NA                                 NA  
     Opponent5a51f2589d39c31899cce9d9  Opponent5a51f2589d39c31899cce9da   
                                   NA                                 NA  
    
    Degrees of Freedom: 3 Total (i.e. Null);  0 Residual
    Null Deviance:      8.318 
    Residual Deviance: 3.033e-10    AIC: 13.98
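
If the data frame already exists with Goals stored as a factor, as in the question, the column can be converted in place instead of rebuilding the data. The following is a minimal sketch, assuming the factor levels are the goal counts written as text; the data frame is a hypothetical reconstruction of the four rows shown in the question. It also shows why the original call failed: comparing a factor with 0 yields NA, so the check inside the Poisson family has no TRUE/FALSE to test.

```r
# Hypothetical reconstruction of the question's data: Goals is a factor.
data <- data.frame(
  Team     = c("5a51f2589d39c31899cce9d9", "5a51f2579d39c31899cce9ce",
               "5a51f2589d39c31899cce9da", "5a51f2579d39c31899cce9cd"),
  Opponent = c("5a51f2579d39c31899cce9ce", "5a51f2589d39c31899cce9d9",
               "5a51f2579d39c31899cce9cd", "5a51f2589d39c31899cce9da"),
  Goals    = factor(c("3", "0", "3", "0")),
  Home     = c(1, 0, 1, 0)
)

# '<' is not meaningful for factors: the comparison returns NA (with a warning),
# which is what makes `if (any(y < 0))` fail inside the Poisson family.
cmp <- suppressWarnings(data$Goals < 0)

# Pitfall: as.numeric() on a factor returns the internal level codes,
# not the values the levels represent.
codes <- as.numeric(data$Goals)   # 2 1 2 1, not 3 0 3 0

# Convert via character to recover the actual goal counts.
data$Goals <- as.numeric(as.character(data$Goals))   # 3 0 3 0

# The original call now runs without the factor error.
model <- glm(Goals ~ Home + Team + Opponent, family = poisson(link = log), data = data)
```

Going through as.character() matters: calling as.numeric() directly on the factor would silently fit the model to the level codes instead of the goal counts.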