I have longitude and latitude data. My third variable is the penetration of electric vehicles in a municipality. The data are sparse, and I do not know the mapping f(long, lat) -> MS_Year. Here is a sample of the data:
long lat MS_Year
<dbl> <dbl> <dbl>
1 -66.0436169857389 50.3417726256247 0.0122
2 -66.1704063635085 48.168838536499 0.0115
3 -67.1376617834163 48.9202603958534 0.0136
4 -67.474931686395 48.8025438021711 0.0108
5 -67.5756670981796 48.5194066352801 0.0111
6 -67.6273066949175 48.429540936994 0.0167
I have been able to do the 3D scatter plot without any problems.
However, I've spent the whole day trying to understand how to do a surface. To my understanding, it is particularly hard, because I need to use a nonparametric estimator to show how complex the topology is. (The idea is to justify a nonparametric regression, which I've just learned about and never used; it might explain my total struggle).
Hence, I need to create a polynomial function f(long,lat) that has output MS_Year.
I tried to apply this as follows:
fit5 <- lm(MS_Year ~ polym(long, lat, degree = 5, raw = TRUE), data = Plot_Me_Tot_2019_grouped)
I did that because it combines this [polynomial regression][2] with this [3D plotting][3]. It's a total failure.
Has anyone ever faced a similar issue?
I think my problem is creating the linking function, i.e. f(long, lat), and then using it with expand.grid(long, lat) to create a surface and plot it.
Note that I do not have a good understanding of how to convert the data frame into the matrix format required for the 3D surface.
Thanks a lot for your time
I think you don't want a polynomial for the whole surface: that's likely to be very unstable, with huge amounts of variation between the points.
However, you might want a low degree local polynomial fit, or some low degree interpolation.
You haven't posted your real data, so I'll demonstrate with fake data. First, we do interpolation between the points:
set.seed(123)
df <- data.frame(long = rnorm(100, -66, 1),
                 lat = rnorm(100, 49, 1))
df$MS_Year <- 0.015 + df$long/1000 + df$lat/1000 + rnorm(100, 0.01, 0.0005)
head(df)
#> long lat MS_Year
#> 1 -66.56048 48.28959 0.007828523
#> 2 -66.23018 49.25688 0.008682913
#> 3 -64.44129 48.75331 0.009179444
#> 4 -65.92949 48.65246 0.007994563
#> 5 -65.87071 48.04838 0.006970499
#> 6 -64.28494 48.95497 0.009431914
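library(rgl)     # plot3d() and persp3d()
library(interp)  # assumption: interp() from the interp package; akima::interp() takes the same arguments
surf <- interp(df$long, df$lat, df$MS_Year,
               xo = sort(df$long), yo = sort(df$lat))
plot3d(df, type = "s", size = 0.5)
persp3d(surf, col = "gray", add = TRUE)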
This did bilinear interpolation between the points; it ends up very rough. You'll probably prefer to fit some sort of surface to the points rather than interpolate them. This fits a local smooth:
library(mgcv)
#> Loading required package: nlme
#> This is mgcv 1.8-38. For overview type 'help("mgcv-package")'.
fit <- gam(MS_Year ~ s(long, lat), data = df)
xo <- sort(df$long)
yo <- sort(df$lat)
grid <- expand.grid(long = xo, lat = yo)
pred <- predict(fit, newdata = grid)
plot3d(df, type = "s", size = 0.5)
persp3d(xo, yo, matrix(pred, length(xo), length(yo)), col = "gray", add = TRUE)
Created on 2022-01-23 by the reprex package (v2.0.1)
That's the same dataset, but the smoother managed to see that it's more or less linear in both long and lat. Your data probably won't end up with such a simple shape.
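If you'd rather try the low degree local polynomial fit mentioned earlier, here is a minimal sketch using loess() from base R. It also spells out the data frame to matrix step: expand.grid() varies its first argument fastest, so filling a matrix column by column with length(xo) rows puts long along the rows and lat along the columns, which is the layout persp3d(x, y, z) expects. The simulated data, the 40 x 40 grid, and span = 0.75 are assumptions for illustration, not recommendations for your data.

```r
# Simulated data with the same column names as in the question (not real data)
set.seed(123)
df <- data.frame(long = rnorm(100, -66, 1),
                 lat = rnorm(100, 49, 1))
df$MS_Year <- 0.015 + df$long/1000 + df$lat/1000 + rnorm(100, 0.01, 0.0005)

# Local linear fit (degree = 1); span is an assumed value that you would tune
fit_lo <- loess(MS_Year ~ long + lat, data = df, degree = 1, span = 0.75)

# Regular grid over the range of the data; by default loess does not
# extrapolate outside the box enclosing the original points
xo <- seq(min(df$long), max(df$long), length.out = 40)
yo <- seq(min(df$lat), max(df$lat), length.out = 40)
grid <- expand.grid(long = xo, lat = yo)

# One prediction per grid row; predict.loess() may already return an array
# for expand.grid() input, so flatten it before reshaping
pred <- predict(fit_lo, newdata = grid)
z <- matrix(as.numeric(pred), nrow = length(xo), ncol = length(yo))

# z[i, j] is the fitted value at (xo[i], yo[j])
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(df, type = "s", size = 0.5)
  rgl::persp3d(xo, yo, z, col = "gray", add = TRUE)
}
```

The same grid, predict, reshape pattern works with the gam() fit above; only the model object changes.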