I started by learning how to use splines to interpolate a 1-dimensional function.
model = spline(bdp[,4]~bdp[,1])
I could then use
predict(model, c(0))
to predict the function value at point 0.
Then I searched the Internet for a way to spline 3-dimensional data and came across an answer on Stack Overflow suggesting that mgcv::gam is the best choice.
And so I tried:
model=gam(bdp[,4]~s(bdp[,1],bdp[,2],bdp[,3]))
and then I did:
predict(model, newdata=c(0,0,0), type="response")
hoping that it would return the value of the spline interpolation at the point (0,0,0). It calculated for a while and returned lots of multidimensional data that I could not understand.
I must be doing something wrong. What do I do to receive a value for a single point from a gam object? And, just to be sure, do you agree that gam is the right choice for spline interpolation of 3D data, or would you suggest something else?
I'm adding a reproducible example.
This is a data file (please unpack in c:/r/) https://www.sendspace.com/file/b4mazl
# install.packages("mgcv")
library(mgcv)
bdp = read.table("c:/r/temp_bdp.csv")
bdg=gam(bdp[,4]~s(bdp[,1],bdp[,2],bdp[,3]))
# this returns lots of data, not just the function value that I wanted.
predict(bdg, newdata=data.frame(0,0,0,0), type="response")
Minimal reproducible example:
tmp = t(matrix(runif(4*200),4))
tmpgam=gam(tmp[,4]~s(tmp[,1],tmp[,2],tmp[,3]))
predict(tmpgam, newdata=data.frame(0,0,0,0), type="response")
For predict(bdg, newdata=data.frame(0,0,0,0), type="response")
it returns a lot of numbers and warns that newdata didn't have enough data
for
predict(bdg, c(0,0,0,0), type="response")
it returns nothing and also warns about the same.
With nearly all types of models you fit, if you plan to use the predict function, it's best to use a "proper" formula with column names rather than matrix/data.frame slices. The reason is that when predict runs, it matches the values in newdata to the model terms by name, so the names in both must match exactly. When you index the data.frame like that, it creates weird names in the model. So the best way to fit the model and predict is
bdg <- gam(V4~s(V1,V2,V3), data=bdp)
predict(bdg, newdata=data.frame(V1=0, V2=0, V3=0))
# 1
# 85431440244
That's assuming
names(bdp)
# [1] "V1" "V2" "V3" "V4"
So here we fit with "V1", "V2" and "V3", and newdata has columns "V1", "V2" and "V3".
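To make the name matching concrete, here is a minimal sketch on simulated data. The data frame d is a hypothetical stand-in for bdp (random numbers, so the fitted value itself is meaningless); the point is only which variable names each formula exposes and that a named one-row newdata gives back a single value.

```r
library(mgcv)

set.seed(1)
# Hypothetical stand-in for bdp: 200 rows of random data.
# as.data.frame() on an unnamed matrix yields columns V1..V4.
d <- as.data.frame(matrix(runif(4 * 200), ncol = 4))

# With matrix/data.frame slices, the formula only refers to the object
# itself, so there are no column names for predict to match against.
slice_vars <- all.vars(d[, 4] ~ s(d[, 1], d[, 2], d[, 3]))

# With column names, each variable appears in the formula by name.
named_vars <- all.vars(V4 ~ s(V1, V2, V3))

# Fit with named columns and predict at one point.
fit <- gam(V4 ~ s(V1, V2, V3), data = d)
p <- predict(fit, newdata = data.frame(V1 = 0, V2 = 0, V3 = 0),
             type = "response")

print(slice_vars)
print(named_vars)
print(length(p))  # one row in newdata, so one predicted value
```

In the slice version, predict cannot find the variables it needs in newdata, which is why it warns and falls back to the original data.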
I've only focused on the R-coding part. The question of whether this is an appropriate analysis is better suited for https://stats.stackexchange.com/