Regression loops

Hi, I have several columns that represent scores. I want to estimate one model per SCORE column, each a simple linear model with STUDYTIME as the only predictor. Then I want to store the STUDYTIME coefficients in a new column whose row names are the SCORE column names. Finally, I am not sure how to account for clustering in the linear models, because each STUDENT appears in the data twice.

Here is my reproducible example. This is the data I have now:

df <- data.frame(replicate(5, rnorm(10)))
df[1]<-c(1,1,2,2,3,3,4,4,5,5)
colnames(df) <- c('student','studytime', 'score1','score2','score3')

This is my attempt at the code:

for (i in 1:nrow(df)) {
  dfx         <- df[,i]
  lm    <- lm(dfx[,3:5] ~ study_time)
  resdat[,i] = summary(lm)$coefficients[2]
}

Solution

You can do this simply with the lapply and sapply functions.

Here is the R code:

Generating Data

df <- data.frame(replicate(5, rnorm(10)))
df[1]<-c(1,1,2,2,3,3,4,4,5,5)
colnames(df) <- c('student','studytime', 'score1','score2','score3')

Storing Results

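# Fit one linear model per score column (every column except student and studytime)
Results <- lapply(df[, -c(1,2)], FUN = function(x) lm(x ~ df$studytime))
# Collect the coefficients: one column per score, rows are the intercept and the studytime slope
Coef <- sapply(Results, FUN = coefficients)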
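
The code above returns both the intercept and the slope for each model and does not address the clustering part of the question. The following is a minimal sketch of one way to keep only the studytime slope, with row names equal to the score column names, and to add student-clustered standard errors. The names score_cols, fits, slopes and cluster_se are illustrative, the set.seed call is added only for reproducibility, and the clustered standard errors assume the sandwich package is installed (that step is skipped otherwise). Note that clustering changes the standard errors, not the coefficient estimates.

```r
# Example data in the same shape as the question (seed added for reproducibility)
set.seed(1)
df <- data.frame(replicate(5, rnorm(10)))
df[1] <- c(1, 1, 2, 2, 3, 3, 4, 4, 5, 5)
colnames(df) <- c("student", "studytime", "score1", "score2", "score3")

# All outcome columns whose names start with "score"
score_cols <- grep("^score", colnames(df), value = TRUE)

# Fit one model per score column; reformulate() builds e.g. score1 ~ studytime
fits <- lapply(score_cols, function(y) {
  lm(reformulate("studytime", response = y), data = df)
})
names(fits) <- score_cols

# Keep only the studytime slope; row names are the score column names
slopes <- data.frame(
  studytime = sapply(fits, function(m) coef(m)[["studytime"]]),
  row.names = score_cols
)

# Assumption: sandwich is installed. vcovCL() gives a cluster-robust
# covariance matrix; here observations are clustered by student.
if (requireNamespace("sandwich", quietly = TRUE)) {
  slopes$cluster_se <- sapply(fits, function(m) {
    sqrt(diag(sandwich::vcovCL(m, cluster = df$student)))[["studytime"]]
  })
}

print(slopes)
```

Passing data = df and building the formula from the column name keeps the coefficient labelled studytime rather than df$studytime, which makes it easier to pull out by name.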