I am trying to understand what the predict.glm() function does, because a project at work uses it.
To do this, I first looked at the example code in the documentation (?predict.glm). That gave me the sense that it can take a fitted glm and predict response values for a given input vector. However, I found it very difficult to customise the "budworm" example, so I created an exceptionally simple model of my own to see how it works. Spoiler: I'm still failing to get it to work.
a<-c(1,2,3,4,5)
b<-c(2,3,4,5,6)
result<-glm(b~a,family=gaussian)
summary(result)
plot(c(0,10), c(0,10), type = "n", xlab = "dose",
     ylab = "response")
xvals<-seq(0,10,0.1)
data.frame(xinputs=xvals)
predict.glm(object=result,newdata= data.frame(xinputs=xvals),type='terms')
#lines(xvals, predict.glm(object=result,newdata = xvals, type="response" ))
When I run predict.glm(object=result,newdata= data.frame(xinputs=xvals),type='terms')
I get this warning:
Warning message:
'newdata' had 101 rows but variables found have 5 rows
From what I understand, it shouldn't matter that the GLM was fitted on only 5 rows. Shouldn't it use the fitted coefficients of that GLM to predict a response value for each of the 101 entries of the new data?
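The column names in the newdata data frame must match the names of the variables you used to fit the model. Your model was fitted with the predictor a, but your newdata only has a column called xinputs, so predict.glm() cannot find a in newdata and falls back to the original 5-element a in your workspace. That is what the warning is telling you. Thus,
predict.glm(object=result,newdata= data.frame(a=xvals),type='terms')
will resolve your issue.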
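One way to spot this kind of mismatch before calling predict is to compare the predictor names the model expects with the column names of newdata. This is only a minimal sketch: the names bad_newdata, good_newdata and predictors are made up for illustration, and the check assumes the formula uses plain variable names.

```r
a <- c(1, 2, 3, 4, 5)
b <- c(2, 3, 4, 5, 6)
result <- glm(b ~ a, family = gaussian)

xvals <- seq(0, 10, 0.1)
bad_newdata  <- data.frame(xinputs = xvals)  # column name differs from the predictor
good_newdata <- data.frame(a = xvals)        # column name matches the predictor

# Predictor names the fitted model expects (response dropped from the terms)
predictors <- all.vars(delete.response(terms(result)))

# Any name printed here is missing from newdata
setdiff(predictors, names(bad_newdata))
setdiff(predictors, names(good_newdata))
```

If the first setdiff() call returns a non-empty vector, predict will look for those variables outside newdata, which is where the row-count warning comes from.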
a <- c(1, 2, 3, 4, 5)
b <- c(2, 3, 4, 5, 6)
result <- glm(b ~ a, family = gaussian)
summary(result)
#>
#> Call:
#> glm(formula = b ~ a, family = gaussian)
#>
#> Deviance Residuals:
#>          1           2           3           4           5
#> -1.776e-15  -8.882e-16  -8.882e-16   0.000e+00   0.000e+00
#>
#> Coefficients:
#>              Estimate Std. Error   t value Pr(>|t|)
#> (Intercept) 1.000e+00  1.317e-15 7.591e+14   <2e-16 ***
#> a           1.000e+00  3.972e-16 2.518e+15   <2e-16 ***
#> ---
#> Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#>
#> (Dispersion parameter for gaussian family taken to be 1.577722e-30)
#>
#> Null deviance: 1.0000e+01 on 4 degrees of freedom
#> Residual deviance: 4.7332e-30 on 3 degrees of freedom
#> AIC: -325.47
#>
#> Number of Fisher Scoring iterations: 1
plot(c(0, 10),
     c(0, 10),
     type = "n",
     xlab = "dose",
     ylab = "response")
xvals <- seq(0, 10, 0.1)
head(data.frame(xinputs = xvals))
#>   xinputs
#> 1     0.0
#> 2     0.1
#> 3     0.2
#> 4     0.3
#> 5     0.4
#> 6     0.5
head(predict.glm(object = result,
                 newdata = data.frame(a = xvals),
                 type = 'terms'))
#>      a
#> 1 -3.0
#> 2 -2.9
#> 3 -2.8
#> 4 -2.7
#> 5 -2.6
#> 6 -2.5
Created on 2020-09-15 by the reprex package (v0.3.0)
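Note that type = 'terms' returns each term's centred contribution rather than the predicted response, which is why the output above starts at -3.0 and not at 1.0. If the goal is the line from the commented-out lines() call in the question, type = "response" is probably what you want. A minimal sketch, where new_df, resp, trm and const are names chosen here for illustration:

```r
a <- c(1, 2, 3, 4, 5)
b <- c(2, 3, 4, 5, 6)
result <- glm(b ~ a, family = gaussian)

xvals <- seq(0, 10, 0.1)
new_df <- data.frame(a = xvals)

# Predictions on the response scale: intercept + slope * a
resp <- predict(result, newdata = new_df, type = "response")

# Centred term contributions; the offset is kept in the "constant" attribute
trm <- predict(result, newdata = new_df, type = "terms")
const <- attr(trm, "constant")

# For this gaussian (identity link) model, adding the constant back
# to the term contributions recovers the response-scale predictions
all.equal(unname(resp), unname(rowSums(trm) + const))

# Draw the fitted line on the empty plot from the question
plot(c(0, 10), c(0, 10), type = "n", xlab = "dose", ylab = "response")
lines(xvals, resp)
```

For a model with a non-identity link (for example binomial), the terms are on the link scale, so the equality check above would not hold against type = "response".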