I am very new to spatial evaluation and come from psychology.
I am using the software R and the packages "gstat" and "spacetime".
I would like to do a spatio temporal interpolation. For this I follow the paper of Gräler et al. (https://cran.r-project.org/web/packages/gstat/vignettes/spatio-temporal-kriging.pdf)
Unfortunately I can't find/fit the right variogram model. I can create the empirical variogram and this is also conclusive to me, but then I do not get any further. I do not understand how to define the individual parameters such as "sill" or "nugget" or what they stand for.
Here are my previous approaches:
My ST-Dataframe:
> df.stf
An object of class "STFDF"
Slot "data":
# A tibble: 2,406 x 6
KRS month DeprIndex MMW10 MMW25 NewInfect
<dbl> <int> <dbl> <dbl> <dbl> <dbl>
1 1001 1 -4.08 NA NA NA
2 1002 1 -2.28 13.3 NA NA
3 1003 1 -3.29 17.8 11.7 NA
4 1004 1 -4.31 NA NA NA
5 1051 1 -2.51 17.7 NA NA
6 1053 1 -1.07 13.0 9.93 NA
7 1054 1 -0.863 NA NA NA
8 1055 1 -1.63 NA NA NA
9 1056 1 0.887 NA NA NA
10 1057 1 -1.21 14.7 8.84 NA
# ... with 2,396 more rows
Slot "sp":
class : SpatialPolygonsDataFrame
features : 401
extent : 3280359, 3921536, 5237511, 6103443 (xmin, xmax, ymin, ymax)
crs : +proj=tmerc +lat_0=0 +lon_0=9 +k=1 +x_0=3500000 +y_0=0 +ellps=bessel +towgs84=598.1,73.7,418.2,0.202,0.045,-2.455,6.7 +units=m +no_defs
variables : 1
names : AGS
min values : 1001
max values : 16077
Slot "time":
timeIndex
0001-01-30 1
0002-01-30 2
0003-01-30 3
0004-01-30 4
0005-01-30 5
0006-01-30 6
Slot "endTime":
[1] "0002-01-30 CET" "0003-01-30 CET" "0004-01-30 CET" "0005-01-30 CET" "0006-01-30 CET" "0007-01-30 CET"
So, I have 401 german counties and the mean monthly PM10 value over six month.
My empirical variogram:
eVgmPm10<-variogramST(MMW10~1,df.stf,tlags = 0:5)
eVgmPm10$dist<-eVgmPm10$dist/1000
eVgmPm10$avgDist <- eVgmPm10$avgDist/1000
eVgmPm10$spacelag <- eVgmPm10$spacelag/1000
plot(eVgmPm10, map=F)
So far everything works as it should (I guess)
Now I wanted to make as in the paper explained a seperable model:
separableModel <- vgmST("separable",
space = vgm(0.9, "Exp", 200, 0.1),
time = vgm(0.9, "Sph", 3.5, 0.1),
sill = 124)
I understand, that I have to create a variogram for space and time. But I do not know how to define the parameters correctly (psill, model, range, nugget) for the variograms. If I use the same parameters as in the paper I get folowing plot:
As you can see, in this model, I just have two lags and the variogram looks realy strange. So, I think it is because of the wrong parameter for my model. I also tested another approach like in the paper.
sumMetricModel <- vgmST("sumMetric",
space = vgm(20, "Sph", 150, 1),
time = vgm(10, "Exp", 2, 0.5),
joint = vgm(80, "Sph", 1500, 2.5),
stAni = 120)
Again, I do not have any idea and clue how to set the parameters so I took the same as in the paper and plot all three.
I am pretty sure it depend on the parameters but I do really not know how to find the correct variogram model to fit.
UPDATE AFTER FIRST ANSWER:
As suggested I used fewer timelage for the new empirical variogram:
eVgmPm10<-variogramST(MMW10~1,df.stf,tlags = 0:2)
As you can see I got stil 3 time steps starting with "lag 0".
Than I tried to fill the model parameters as suggested.
separableModel<-vgmST("separable",
space=vgm(psill = 15,model = "Exp", range =250 ,nugget = 5),
time = vgm(psill = 15,model = "Exp", range =1 ,nugget = 0),
sill = 15)
Here is the wireframe of both:
I really frustrated and I am really sorry for my lack of knowledge, but I really try hard to understand it and I am thankfull for every help
The initial parameters of sill
, range
and nugget
can be read from the empirical spatiotemporal variogram. Produce a 3D wireframe plot of your empirical variogram by:
plot(eVgmPm10, wireframe=T, scales=list(arrows=F))
The nugget is about the approximate intersection of the surface with the z-axis (this amounts to small scale variability that the model cannot explain). The sill is about the maximum z-value your empirical surface approaches and the range is about the distance where the surface flattens out. You will have a spatial (x-axis) and temporal (y-axis) range.
Different models interpret these values slightly differently; details are given in Section 2.1 of the gstat vignette which you reference in your question. In the separable case, e.g., the spatial and temporal variogram components are normalised to a joint sill (sill + nugget) of 1 and the model is "stretched" by the sill parameter in the model definition. In the sum-metric models, this normalisation does not apply. For the metric components, a spatiotemporal anisotropy has to be fitted in order to relate spatial and temporal distances in a single model (check gstat::estiStAni()
for that). I recommend to run demo("stkrige")
in R and to experiment with the parameters to get a better feeling of their influence. Note that the parameters affect each other as well which sometimes makes the interpretation and numerical optimisation troublesome.
Based on what you provided in your question, a nugget of 5, sill of 15 and spatial range of 250 might be values to start with. Furthermore, it appears as if your empirical surface moves in waves along the temporal domain. This might be an artefact of the temporal coverage of your data set. 6 time steps (months in your case) are very few to estimate any temporal dependence, in especially beyond 1 (or maybe 2) steps. Your plot indicates that you are looking at 5 time steps, where "lag5" will only be based on the values of the first and last month. When estimating your empirical variogram consider using fewer temporal lags:
eVgmPm10<-variogramST(MMW10~1, df.stf, tlags = 0:2)
Referring to your question's title "the correct variogram", it will be very hard to identify what that is. You can only find a model which will be a "good" approximation of the dependence structure in the data that you have provided to the algorithm. The dependence might for instance change between winter and summer months and there might be seasonal components in the mean which also need to be reflected in the model.