r, lm, p-value

How to get a collection of p-values for linear regression?


I have a data set with 131 columns. The first column is my Y and the other 130 are Xs. I want to fit 130 linear regressions, lm(y ~ x1), lm(y ~ x2), lm(y ~ x3), ..., lm(y ~ x130), and then get the p-value of each fit. How can I make this faster? With a for loop or with apply?


Solution

  • Using base R only, this can be done with a series of *apply calls.

    First, I will make up some data since you have posted none.

    set.seed(7637)    # Make the results reproducible
    
    n <- 100
    dat <- as.data.frame(replicate(11, rnorm(n)))
    names(dat) <- c("Y", paste0("X", 1:10))
    

    Now, for the regressions.

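    # Fit one simple regression per predictor; x is each column of dat[-1] in turn
    lm_list <- lapply(dat[-1], function(x) lm(Y ~ x, dat))
    lm_smry <- lapply(lm_list, summary)
    # Two-row matrix of p-values, one column per predictor:
    # row "(Intercept)" and row "x" (the slope)
    lm_pval <- sapply(lm_smry, function(x) x$coefficients[, "Pr(>|t|)"])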
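
    If only the slope p-values are needed, the steps above can be condensed as in the following sketch. The names slope_pval, r, t_stat and cor_pval are illustrative and not part of the original answer. The second half is an alternative that skips lm() altogether. It relies on the standard identity between the slope's t statistic and the Pearson correlation in a one-predictor regression, and it assumes the data contain no missing values.

    ```r
    # Same simulated data as in the answer above
    set.seed(7637)
    n <- 100
    dat <- as.data.frame(replicate(11, rnorm(n)))
    names(dat) <- c("Y", paste0("X", 1:10))

    # Option 1: one lm() per predictor, keeping only the slope's p-value.
    # Row 2 of the coefficient table is the slope; row 1 is the intercept.
    slope_pval <- vapply(
      dat[-1],
      function(x) summary(lm(dat$Y ~ x))$coefficients[2, "Pr(>|t|)"],
      numeric(1)
    )

    # Option 2: no model fitting. In simple regression the slope's t statistic
    # is r * sqrt((n - 2) / (1 - r^2)), where r is the correlation of Y and x.
    r <- cor(dat$Y, dat[-1])[1, ]
    t_stat <- r * sqrt((n - 2) / (1 - r^2))
    cor_pval <- 2 * pt(-abs(t_stat), df = n - 2)

    # Both approaches should agree up to floating-point error
    all.equal(slope_pval, cor_pval)
    ```

    Both results are named numeric vectors with one entry per predictor. With many columns, the correlation-based version is likely to be the faster of the two because it avoids building a model object for every fit, but that is worth timing on the real data.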