I am trying to write an algorithm which does the following in R:
dat
use the step function to perform glm model selection of j covariates from a set of J candidate variables
compare the call of j variates with the full vector J. Write the outcome into a 1xJ vector, where 1 indicates the variable is in the final call and 0 otherwise.
Example:
In the following example three variables (x, y, z) are candidates for prediction of the variable dep. The step function is used for variable selection. My goal is to end up with a vector indicating which of the input variables are in the final model, so here, c(1,0,1).
n <- 1000
x <- rnorm(n, 0, 1)
y <- rnorm(n, 0, 1)
z <- rnorm(n, 0, 1)
dep <- 1 + 2 * x + 3 * z + rnorm(n, 0, 1)
m <- step(lm(dep ~ x + y + z), direction = "backward")
I have difficulties extracting the variable names from the final m$call and creating the vector.
I think this does it:
n <- 1000
x <- rnorm(n, 0, 1)
y <- rnorm(n, 0, 1)
z <- rnorm(n, 0, 1)
dep <- 1 + 2 * x + 3 * z + rnorm(n, 0, 1)
m <- step(lm(dep ~ x + y + z), direction = "backward")
# the terms object of the final model stores the labels of the retained terms
matt <- attributes(m$terms)
matt$term.labels
#[1] "x" "z"
v <- c("x", "y", "z")
as.integer(v %in% matt$term.labels)
#[1] 1 0 1
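
If you need this for more than one model, the same idea can be wrapped in a small function. This is only a sketch: selected_indicator is a name I made up (it is not part of R), the seed is arbitrary, and it assumes the candidates are plain main-effect terms so that the term labels match the variable names. Because the data are random, y may occasionally survive the selection, so the middle entry is not guaranteed to be 0.

```r
# Hypothetical helper: returns a named 0/1 vector marking which terms
# of the full model are still present after step().
selected_indicator <- function(full_model, ...) {
  # term labels of the full model = the J candidates
  candidates <- attr(terms(full_model), "term.labels")
  # trace = 0 suppresses the step-by-step printout
  final <- step(full_model, trace = 0, ...)
  # term labels of the selected model = the j retained variables
  kept <- attr(terms(final), "term.labels")
  setNames(as.integer(candidates %in% kept), candidates)
}

set.seed(1)  # arbitrary seed, only for reproducibility
n <- 1000
x <- rnorm(n, 0, 1)
y <- rnorm(n, 0, 1)
z <- rnorm(n, 0, 1)
dep <- 1 + 2 * x + 3 * z + rnorm(n, 0, 1)

ind <- selected_indicator(lm(dep ~ x + y + z), direction = "backward")
ind
```

Taking the candidate names from the full model avoids typing v by hand, and the names on the result make it easy to see which entry belongs to which variable.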