I've made a GAM model in R using the following code:
mod_gam1 <-gam(y ~ s(ï..x), data=Bird.data, method = "REML")
plot(mod_gam1)
coef(mod_gam1)
plot(mod_gam1, residuals = TRUE, pch = 1)
coef(mod_gam1)
mod_gam1$fitted.values
result <- data.frame(data = c(mod_gam1$fitted.values, Bird.data$y),
                     Year = rep(1991:2019, times = 2),
                     source = c(rep('Modelled', times = 29), rep('Observed', times = 29)))
ggplot(result, aes(x = Year, y = data, colour = source)) +
  geom_point() +
  geom_smooth(span = 0.8) +
  labs(x = "Year", y = "Bird Island Total Debris Count") +
  scale_y_continuous(limits = c(0, 1000))
and the output looks OK, but the shaded error band drawn by geom_smooth doesn't extend across my whole dataset (it stops short of my first two data points), and I am not sure why.
Any help would be appreciated!
I can't upload a picture as I am new to the site, but basically I have two series (observed and GAM-modelled values), each with its own SE confidence ribbon, and the ribbons start two data points into the series rather than at the first point.
These are my datapoints: Bird.data
ï..x | y |
---|---|
1991 | 17 |
1992 | 76 |
1993 | 328 |
1994 | 131 |
1995 | 425 |
1996 | 892 |
1997 | 501 |
1998 | 419 |
1999 | 297 |
2000 | 277 |
2001 | 310 |
2002 | 282 |
2003 | 189 |
2004 | 278 |
2005 | 322 |
2006 | 444 |
2007 | 412 |
2008 | 241 |
2009 | 242 |
2010 | 255 |
2011 | 289 |
2012 | 335 |
2013 | 279 |
2014 | 628 |
2015 | 500 |
2016 | 174 |
2017 | 636 |
2018 | 420 |
2019 | 447 |
Fitted Values
[1] 95.56189 177.01468 255.17074 324.97532 380.28813 415.71334 428.67793 420.86624 398.18522 369.06325
[11] 341.72715 321.65585 310.33971 305.81158 304.53360 303.60521 302.21413 301.75501 304.77184 313.43400
[21] 328.37279 348.39076 371.04203 393.66222 414.29754 432.15104 447.48020 461.14595 474.09266
Negative Binomial
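This is because of the limits you set with scale_y_continuous. The lower edge of the confidence ribbon falls below 0 at the first points, and anything outside the scale limits is dropped. If you remove that line (or lower the bottom limit so that it covers the minimum y value of the smooth), you will see the ribbon fill completely.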
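If you still want the visible range to stay at 0 to 1000, one option is to zoom with coord_cartesian() instead of setting scale limits. The sketch below uses a small toy data frame (the names toy, base, p_scale and p_zoom are made up for illustration) and a loess smooth, so treat it as an illustration of the difference rather than a reproduction of your plot:

```r
library(ggplot2)

# Toy data: the first ten observed counts from the question
toy <- data.frame(
  Year  = 1991:2000,
  count = c(17, 76, 328, 131, 425, 892, 501, 419, 297, 277)
)

base <- ggplot(toy, aes(x = Year, y = count)) +
  geom_point() +
  geom_smooth(method = "loess", formula = y ~ x, span = 0.8)

# Scale limits: values outside [0, 1000] are dropped, so the ribbon
# disappears wherever its lower or upper edge leaves that range
p_scale <- base + scale_y_continuous(limits = c(0, 1000))

# coord_cartesian(): only the view is zoomed; the ribbon is kept and
# simply clipped at the edge of the panel
p_zoom <- base + coord_cartesian(ylim = c(0, 1000))
```

Printing p_scale and p_zoom side by side should make the difference visible wherever the ribbon leaves the 0 to 1000 range.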
However, you have a larger problem here. You are not actually showing the GAM in the smooth: geom_smooth is fitting its own smoother to the GAM point predictions. There are a couple of ways to fix this. The easiest might be to feed Bird.data directly to ggplot() and use the method and formula parameters of geom_smooth() to request the GAM smooth directly:
# Assumes the year column (ï..x in the question) has been renamed to x
ggplot(Bird.data, aes(x, y)) +
  geom_point() +
  geom_smooth(method = "gam", formula = y ~ s(x)) +
  labs(x = "Year", y = "Bird Island Total Debris Count")
The problem with this approach is that you don't get the prediction points as well. This can be fixed with the following approach.
First, add the standard errors of the predictions to your result dataframe (the observed rows get NA):
result$se = c(predict(mod_gam1, se.fit = TRUE)$se.fit, rep(NA, 29))
Then use geom_ribbon, setting the ymin and ymax directly:
ggplot(result, aes(x = Year, y = data, colour = source, fill = source)) +
  geom_point() +
  geom_ribbon(aes(ymin = data - 1.96 * se, ymax = data + 1.96 * se), alpha = 0.2) +
  labs(x = "Year", y = "Bird Island Total Debris Count") +
  scale_y_continuous(limits = c(-200, 1000))
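For reference, here is a self-contained sketch of the same idea that keeps everything in one data frame instead of stacking modelled and observed values. It assumes the year column is called x, and the names bird, fit, pred, lower and upper are only illustrative:

```r
library(mgcv)
library(ggplot2)

# Data copied from the question; the year column is named x here for convenience
bird <- data.frame(
  x = 1991:2019,
  y = c(17, 76, 328, 131, 425, 892, 501, 419, 297, 277,
        310, 282, 189, 278, 322, 444, 412, 241, 242, 255,
        289, 335, 279, 628, 500, 174, 636, 420, 447)
)

# Same model specification as in the question
fit <- gam(y ~ s(x), data = bird, method = "REML")

# Fitted values and their standard errors at the observed years
pred <- predict(fit, se.fit = TRUE)
bird$fit   <- as.numeric(pred$fit)
bird$lower <- bird$fit - 1.96 * as.numeric(pred$se.fit)
bird$upper <- bird$fit + 1.96 * as.numeric(pred$se.fit)

# Observed points, GAM fit as a line, approximate 95% band as a ribbon
p <- ggplot(bird, aes(x = x)) +
  geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
  geom_line(aes(y = fit)) +
  geom_point(aes(y = y)) +
  labs(x = "Year", y = "Bird Island Total Debris Count")
```

No y limits are set here, so the ribbon is drawn in full; if you want to restrict the visible range, coord_cartesian(ylim = ...) avoids dropping parts of it.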