After running a multiple regression in R, the regression summary marks the significant variables with stars. The dataset I am working on has nearly 2000 variables, and more than 50 of them are flagged as significant. Is there some way to get a list of only the significant variables from the regression summary?
This is an example of why you should not be doing what you ask us to do:
randf <- as.data.frame(matrix(rnorm(800*400), 800, 400))
names(randf)[1] <- "Y"
big.mod <- lm(Y ~ ., data=randf)
sum( summary(big.mod)$coefficients[ ,4] < 0.05 )
#[1] 22
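So we get 22 significant coefficients (some of them "highly significant") just by regressing one random variable on 399 other random variables. That is about what chance alone predicts: with 400 tests (399 slopes plus the intercept) at the 0.05 level, roughly 0.05 × 400 = 20 false positives are expected.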
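If you still want the names themselves, here is a minimal sketch that reuses the random-data model from above. The `set.seed(1)` call and the names `coefs`, `pvals`, `sig.names` and `p.adj` are my own additions for illustration, and Benjamini-Hochberg is just one possible multiple-testing adjustment, not a recommendation specific to your data.

```r
# Assumption: seed added only so the sketch is reproducible;
# the counts will differ from the 22 reported above.
set.seed(1)
randf <- as.data.frame(matrix(rnorm(800 * 400), 800, 400))
names(randf)[1] <- "Y"
big.mod <- lm(Y ~ ., data = randf)

# The coefficient table is a matrix; its row names are the term names
# and the "Pr(>|t|)" column holds the p-values shown in summary().
coefs <- summary(big.mod)$coefficients
pvals <- coefs[, "Pr(>|t|)"]

# Names of the terms whose unadjusted p-value is below 0.05
sig.names <- rownames(coefs)[pvals < 0.05]
sig.names

# Adjust for the number of tests before trusting any of them
# (Benjamini-Hochberg false discovery rate, via stats::p.adjust)
p.adj <- p.adjust(pvals, method = "BH")
sum(p.adj < 0.05)
```

Comparing `length(sig.names)` with `sum(p.adj < 0.05)` on your own model gives a rough sense of how many of the starred variables would survive a correction for the number of tests.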