I currently have the code below. The function works when I call it directly, but when I pass it to boot() I get this error:
Error in t.star[r, ] <- res[[r]] : number of items to replace is not a multiple of replacement length
With lower values of R, boot() runs properly. Is there something I need to add to my function so that I stop getting this error?
alassoOLS_ydot_n10_fn <- function(data, index) {  # index is the bootstrap sample index
  x <- data[index, -1]
  y <- data[index, 1]
  cv.out <- cv.glmnet(x, y, alpha = 1, nfolds = 10, penalty.factor = 1 / abs(best_ridge_coef.ydot.n10))  # alpha = 1: lasso
  bestlam <- cv.out$lambda.min  # the best lambda chosen by CV
  lasso.mod <- glmnet(x, y, alpha = 1, lambda = bestlam, penalty.factor = 1 / abs(best_ridge_coef.ydot.n10))
  coef <- as.vector(coef(lasso.mod))[-1]
  coef_nonzero <- coef != 0
  ls.obj <- lm(y ~ x[, coef_nonzero, drop = FALSE])
  ls_coef <- (ls.obj$coefficients)[-1]
  return(ls_coef)
}
boot(ydot_matrix_n10, alassoOLS_ydot_n10_fn, R = 500)
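The length of the vector returned by alassoOLS_ydot_n10_fn is not constant; it depends on the number of variables selected by glmnet. boot stores each replicate in one row of a matrix that has as many columns as the statistic computed on the original data, so a replicate of a different length triggers the error. With a smaller R there are fewer resamples, and therefore fewer chances that one of them selects a different number of variables.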
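To make the cause concrete, here is a minimal base-R sketch of the kind of row assignment named in the error message. The names t0 and t.star mirror the ones in the message; the values are made up for illustration and this is not the actual boot source.

```r
# Illustrative values only: pretend the statistic on the original data has length 3.
t0 <- c(0.1, 0.2, 0.3)

# One row per bootstrap replicate, one column per element of t0.
t.star <- matrix(NA_real_, nrow = 2, ncol = length(t0))

# A replicate with the same length as t0 fits into its row.
t.star[1, ] <- c(0.4, 0.5, 0.6)

# A replicate with a different length (here 2 instead of 3) cannot be stored.
res <- tryCatch({
  t.star[2, ] <- c(0.7, 0.8)
  "ok"
}, error = function(e) conditionMessage(e))

print(res)
```

The second assignment fails with the same "number of items to replace is not a multiple of replacement length" message, which is why the statistic has to return a vector of the same length on every resample.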
I modified your function as follows:
alassoOLS_ydot_n10_fn <- function(data, index) {
  x <- data[index, -1]
  y <- data[index, 1]
  cv.out <- cv.glmnet(x, y, alpha = 1, nfolds = 10)
  bestlam <- cv.out$lambda.min  # the best lambda chosen by CV
  lasso.mod <- glmnet(x, y, alpha = 1, lambda = bestlam, penalty.factor = 1 / abs(best_ridge_coef.ydot.n10))
  coef <- as.vector(coef(lasso.mod))[-1]
  coef_nonzero <- coef != 0
  ls.obj <- lm(y ~ x[, coef_nonzero, drop = FALSE])
  ls_coef <- (ls.obj$coefficients)[-1]
  # Generate a fixed-length vector of OLS coefficients.
  # The coefficients of variables not selected by glmnet are set to zero.
  vect_coef <- rep(0, length(coef_nonzero))
  vect_coef[coef_nonzero] <- ls_coef
  return(vect_coef)
}
Now the output is a fixed-length vector of coefficients.
I set the coefficients of the covariates not selected by glmnet to 0.
(I don't know whether this is correct from a statistical point of view for your investigation; my aim is only to show the source of the error message given by boot.)
With this change, boot works with no errors. See the following example.
library(boot)
library(glmnet)
set.seed(1)
ydot_matrix_n10 <- matrix(runif(1000), ncol = 10)
best_ridge_coef.ydot.n10 <- 10
boot(ydot_matrix_n10, alassoOLS_ydot_n10_fn, R = 50)
The output is reported below.
ORDINARY NONPARAMETRIC BOOTSTRAP
Bootstrap Statistics :
        original         bias  std. error
t1*   0.00000000 -0.002330197  0.02319543
t2*   0.13530886 -0.001906712  0.09889174
t3*  -0.19509877 -0.013020365  0.07251921
t4*  -0.01954785  0.015227018  0.09184750
t5*   0.05600451  0.008896392  0.08729263
t6*   0.12978757 -0.013795860  0.11320119
t7*   0.06525111 -0.007208380  0.09703813
t8*   0.09368079 -0.017343037  0.08947958
t9*  -0.09518469 -0.003352512  0.08575450
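Because unselected covariates are stored as exact zeros, the replicate matrix in the returned object can also show how often each covariate was kept. The sketch below is only an illustration of that idea: toy_stat, toy_data and the 0.1 correlation threshold are hypothetical stand-ins for the glmnet-based selection, chosen so the example runs with the boot package alone.

```r
library(boot)

# Hypothetical toy statistic: keep predictors whose absolute correlation
# with y exceeds 0.1, refit by OLS, and pad unselected positions with zeros
# so that the returned vector always has ncol(x) elements.
toy_stat <- function(data, index) {
  x <- data[index, -1]
  y <- data[index, 1]
  keep <- abs(as.vector(cor(x, y))) > 0.1
  out <- rep(0, ncol(x))
  if (any(keep)) {
    fit <- lm(y ~ x[, keep, drop = FALSE])
    out[keep] <- coef(fit)[-1]  # drop the intercept
  }
  out
}

set.seed(1)
toy_data <- matrix(runif(1000), ncol = 10)
b <- boot(toy_data, toy_stat, R = 50)

# b$t holds one row per resample and one column per predictor.
dim(b$t)

# Share of resamples in which each predictor had a nonzero coefficient.
selection_freq <- colMeans(b$t != 0)
print(selection_freq)
```

Keep in mind that the bias and standard error printed by boot average over these padded zeros, so they mix "not selected" with "selected but small"; whether that is what you want depends on your analysis.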