Storing coefficients and vectors using the boot package in R


I am estimating a model of the form y = x + s(z), where s(z) is a nonparametric function. I want to use the bootstrap to get the standard error of the coefficient on x and confidence bands for the function s(z). The estimation yields a coefficient for x (a 1x1 object) and an n x 1 vector for s(z). A model of this kind can be fitted with the gam() function from the gam package. For my purposes I am using a hand-written function that returns a list named result, from which I get result$betax, the coefficient on x, and result$curve, which stores the vector of values (the estimate of s(z) is a set of values tracing out a curve). I am bootstrapping with the boot package as follows:

result.boot <- boot(data, myfunction, R=3, sim = "parametric", 
                    ran.gen = myfunction.sim, mle = myfunction.mle)

I obtain the following error message

Error in boot(pdata, myfunction, R = 3, sim = "parametric", 
ran.gen = myfunction.sim, : incorrect number of subscripts on matrix

I guess it should instead return a vector of coefficients on x, from which I will compute the standard error, and a matrix of s(z) values with n rows (one column per bootstrap replicate), from which I will compute a standard error for each row, giving me a confidence interval for the s(z) curve. I suppose the error is related to the fact that my function returns a list:

est <- list("betax" = betax, "curve" = s.z, "residuals"=res)
return(est)

How could I solve this? To reproduce the issue, it is possible to use the gam function:

y <- runif(16, min = 0, max = 1)
x <- runif(16, min = 0, max = 0.5)
z <- runif(16, min = 0, max = 0.3)
library(gam)
est <- gam(y ~ x + s(z))

whereas I am actually calling

est <- myfunction(y, x, z)

Solution

  • The solution is to flatten the output of the hand-written function into a single numeric vector. This makes it compatible with boot, which stores the statistic from each replicate as one row of the matrix result.boot$t and therefore needs a vector rather than a list.

    est <- myfunction(y, x, z)
    good.output <- c(est$betax, est$curve)
    

    Returning good.output from the statistic function lets boot work properly. You can then extract the corresponding columns of result.boot$t (the first column holds the coefficient on x, the remaining columns the values of s(z)) and compute whatever statistics you like.
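As a rough, self-contained sketch of the whole procedure, assuming a stand-in estimator rather than the original hand-written one: below, lm() with a natural spline from the splines package plays the role of the nonparametric fit, and my_stat, my_sim, and my_mle are hypothetical names chosen for illustration. The normal-error simulation in my_sim is also an assumption; substitute whatever your model implies.

```r
library(boot)     # recommended package, shipped with R
library(splines)  # base R; ns() stands in for the smooth s(z)

set.seed(1)
n <- 16
dat <- data.frame(
  y = runif(n, min = 0, max = 1),
  x = runif(n, min = 0, max = 0.5),
  z = runif(n, min = 0, max = 0.3)
)

# Hypothetical stand-in for myfunction. It returns ONE numeric vector:
# element 1 is the coefficient on x, elements 2..(n + 1) are the curve.
my_stat <- function(d) {
  fit <- lm(y ~ x + ns(z, df = 3), data = d)
  betax <- unname(coef(fit)["x"])
  # Fitted values minus the linear part in x (intercept + spline part)
  curve <- unname(fitted(fit)) - betax * d$x
  c(betax, curve)
}

# mle: whatever ran.gen needs to simulate new responses
fit0 <- lm(y ~ x + ns(z, df = 3), data = dat)
my_mle <- list(fitted = unname(fitted(fit0)), sigma = summary(fit0)$sigma)

# ran.gen: takes (data, mle) and returns a simulated data set.
# Assumption: independent normal errors around the fitted values.
my_sim <- function(d, mle) {
  d$y <- mle$fitted + rnorm(nrow(d), sd = mle$sigma)
  d
}

result.boot <- boot(dat, my_stat, R = 200, sim = "parametric",
                    ran.gen = my_sim, mle = my_mle)

# result.boot$t has one row per replicate and 1 + n columns
se.betax <- sd(result.boot$t[, 1])            # s.e. of the coefficient on x
curve.t  <- result.boot$t[, -1, drop = FALSE] # R x n matrix of curve values
se.curve <- apply(curve.t, 2, sd)             # pointwise s.e. of s(z)

# Pointwise 95% percentile bands: a 2 x n matrix (lower, upper)
bands <- apply(curve.t, 2, quantile, probs = c(0.025, 0.975))
```

Note that boot stores replicates by row, so the standard errors of the curve are taken over columns of result.boot$t, not rows. R = 200 is used here only to keep the example quick; the R = 3 from the question is too small for meaningful standard errors.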