I'm working on creating a model that examines the effect of ocean characteristics on fishing outcomes. I have spatial data on a 0.5 degree grid and I created the following model:
gam(asinh(yvar) ~ s(lat, lon, bs = "sos") + s(xvar1) +
    s(xvar2) + s(xvar3), data = dat, method = "REML")
The QQ plot and histogram of residuals look okay. However, gam.check() produces an odd pattern in the residuals plot. I know that the points should be scattered around 0, but mine show a very odd pattern. Can anyone provide some insight on the interpretation of this plot?
Those will be either all the 0s (most likely) or the 1s/smallest value in your original data. You don’t say what these data are, but as you mention fishing outcomes it is highly likely that they have some natural lower bound, and this line in the residuals is made up of all the observations that take this lower bound (before transformation).
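To see why observations at the lower bound line up like this, here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the single covariate x, the Poisson-with-extra-zeros response, and the sample size are all hypothetical stand-ins, not your data. Since asinh(0) = 0, every zero observation has residual equal to minus its fitted value, so those points fall on a straight line with slope -1 in the residuals vs. fitted plot.

```r
library(mgcv)

set.seed(1)
n <- 500
x <- runif(n)

# Hypothetical "catch" data: a smooth signal in x plus many exact zeros
mu <- exp(1 + sin(2 * pi * x))
y  <- rpois(n, mu) * rbinom(n, 1, 0.6)
sim <- data.frame(x = x, y = y)

# Gaussian GAM on the asinh-transformed response, as in the question
m <- gam(asinh(y) ~ s(x), data = sim, method = "REML")

res     <- residuals(m, type = "response")  # observed minus fitted
fit     <- fitted(m)
is_zero <- sim$y == 0

# Zeros (red) sit exactly on the dashed line: residual = 0 - fitted
plot(fit, res, col = ifelse(is_zero, "red", "grey40"),
     xlab = "Fitted values", ylab = "Response residuals")
abline(0, -1, lty = 2)
```

If the line in your own plot matches the observations with yvar == 0 (or whatever the smallest value is), that would confirm the explanation; colouring the points by that condition, as above, is a quick way to check.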
As you don’t say exactly what your data are, it is difficult to comment further on how to proceed (this may not be an issue, or you may need to drop the transformation you used and instead fit a GLM or other model with a non-Gaussian response), but