I am trying to do a HGAM in order to separate my large dataset by Reservoir. One of my predictor variables is Season_Y, indicating the season and year it the sample was taken. Here is my code:
model = gam(
log1p(GEO) ~ s(Season_Y, by=Res, k=, bs ="fs") +
t2(TP, TIN.TP, by=Res, bs="fs", k=5) +
s(Sil, by=Res, k=5) +
s(SO4, by=Res, k=5) +
s(Dfe, by=Res, k=5) +
s(Ortho, by=Res, k=5) +
s(NO3.NH3, by=Res, k=5),
data=gam.data,
family=Gamma,
method="REML"
)
Normally in a GLM I get individual scores for each subsequent Season_Y, e.g. Winter_2019. but I only an overall Season_Y score. As displayed below:
edf Ref.df F p-value
s(Season_Y):ResAlaw 4.603e+00 6.000 3.091e+02 0.99783
s(Season_Y):ResAlwen 2.474e+00 5.000 2.085e+03 0.99962
s(Season_Y):ResCefni 2.600e+00 7.000 1.168e+03 0.99809
s(Season_Y):ResDolwen 6.863e-05 1.000 4.220e+07 0.99803
s(Season_Y):ResLlandegfedd 6.795e-05 1.000 1.219e+07 0.99856
s(Season_Y):ResLlwyn Onn 4.350e+00 6.000 3.917e+01 0.99830
s(Season_Y):ResPentwyn 1.267e-04 2.000 4.130e+05 0.99996
s(Season_Y):ResPlas Uchaf 4.798e-02 5.000 2.801e+04 0.99768
s(Season_Y):ResPontsticill 3.182e+00 7.000 1.672e+01 0.99950
When I do gam.check
I don't get any values:
k' edf k-index p-value
s(Season_Y):ResAlaw 7.00e+00 4.60e+00 NA NA
s(Season_Y):ResAlwen 7.00e+00 2.47e+00 NA NA
s(Season_Y):ResCefni 7.00e+00 2.60e+00 NA NA
s(Season_Y):ResDolwen 7.00e+00 6.86e-05 NA NA
s(Season_Y):ResLlandegfedd 7.00e+00 6.79e-05 NA NA
s(Season_Y):ResLlwyn Onn 7.00e+00 4.35e+00 NA NA
s(Season_Y):ResPentwyn 7.00e+00 1.27e-04 NA NA
s(Season_Y):ResPlas Uchaf 7.00e+00 4.80e-02 NA NA
s(Season_Y):ResPontsticill 7.00e+00 3.18e+00 NA NA
Just want a breakdown of how each season from each year is significant, please?
Any help would be much appreciated.
Thank you!
If Season_Y
really contains data like "Winter_2019"
you are using the fs
basis incorrectly. This basis is the equivalent of a random slope, but instead of a linear effect we get a smooth effect for each level of the grouping variable. The critical thing here is that you are producing a smooth (lots of them) so you have to pass it a continuous variable and Season_Y
isn't continuous. If this is even doing anything sensible, it is just creating a random effect (intercept) and hence it would be much clearer in terms of the intention of the code to use:
s(Season_Y, by = Res, bs = "re")
As to why everything is NA
in the basis dimension check, you asked for random effects and the basis dimension check is not available for those terms. From ?check.k
:
Currently smooths of factor variables are not supported and will give an NA p-value.
I think you are genuinely confusing outputs here. The first output block you show is the output from summary.gam()
. The second output is from gam.check()
(or k.check()
) and these are not the same thing at all.