I am trying to incorporate generalized additive models gam()
from mgcv
package to either xyplot()
function or coplot()
function from lattice
package in R.
The data can be found in http://statweb.stanford.edu/~tibs/ElemStatLearn/, by selecting ozone data.
Here are my code for kernel smoothing.
ozonedata=ozone
Temperature=equal.count(ozonedata$temperature,4,1/2)
Wind=equal.count(ozonedata$wind,4,1/2)
xyplot(ozone^(1/3) ~ radiation | Temperature * Wind, data = ozonedata, as.table = TRUE,
panel = function(x, y, ...) {panel.xyplot(x, y, ...);panel.loess(x, y)},
pch = 20,xlab = "Solar Radiation", ylab = "Ozone (ppb)")
or
coplot((ozone^(1/3))~radiation|temperature*wind,data=ozonedata,number=c(4,4),
panel = function(x, y, ...) panel.smooth(x, y, span = .8, ...),
xlab="Solar radiation (langleys)", ylab="Ozone (cube root ppb)")
The generalized additive models is generated as following.
gam_ozone = gam(ozone^(1/3)~s(radiation)+s(temperature)+s(wind),data=ozonedata,method="GCV.Cp")
Now I am having a trouble combining fitting from gam()
into lattice plots.
I think this should work. Notice that we are passing in the gam_ozone
object as a parameter to xyplot
called gam
. This will allow us to access it in the panel function.
xyplot(ozone^(1/3) ~ radiation | Temperature * Wind, data = ozonedata,
as.table = TRUE, gam=gam_ozone,
panel = function(x, y, gam,...) {
xx <- dimnames(trellis.last.object())
ww <- arrayInd(packet.number(), .dim=sapply(xx, length))
avgtmp <- mean(xx[[1]][[ww[1]]])
avgwind <- mean(xx[[2]][[ww[2]]])
gx<-seq(min(x, na.rm=T), max(x, na.rm=T), length.out=50)
gy<-predict(gam, data.frame(radiation=gx,
temperature=avgtmp, wind=avgwind))
panel.xyplot(x, y, ...);
panel.xyplot(gx, gy, ..., col="red", type="l");
},
pch = 20,xlab = "Solar Radiation", ylab = "Ozone (ppb)"
)
Now in order to use the gam
to predict, you're going to have to find a value to use for wind and temperature for each panel. What i've decide to do, is just take the middle value for each shingle range. So I used a Deepayan-approved, undocumented feature to get at the ranges for each of the current shingles with the call to dimnames
. And then i find the current panel with packet.number()
Once I have the ranges, i take the means to get an average value.
Not i'll use the gam
model we passed in to predict the curve for each panel. I calculate a range of 50 x
values based on observed values and then predict from the gam
a new line.
Finally I draw the raw data with panel.xyplot
and then i draw the gam
predicted line.