Tags: r, linear-regression

Fast adjusted r-squared extraction


.lm.fit() is considerably faster than lm(), for reasons documented in several places, but getting an adjusted R-squared value from its result is not as straightforward, so I'm hoping for some help.

Using lm() and then summary() to get the adjusted r-squared.

tstlm <- lm(cyl ~ hp + wt, data = mtcars)

summary(tstlm)$adj.r.squared

Using .lm.fit():

mtmatrix <- as.matrix(mtcars)

tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])

And here I'm stuck. I suspect the information I need to calculate the adjusted R-squared is somewhere in the object returned by .lm.fit(), but I can't quite figure out how to proceed.

Thanks in advance for any suggestions.


Solution

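  • The following function computes the adjusted R-squared from an object returned by .lm.fit() and the response vector y. It assumes the design matrix includes an intercept column, as in the example: the fitted values are centred on their mean and the numerator degrees of freedom are n - 1.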

    adj_r2_lmfit <- function(object, y){
      ypred <- y - resid(object)                  # fitted values
      mss <- sum((ypred - mean(ypred))^2)         # model sum of squares
      rss <- sum(resid(object)^2)                 # residual sum of squares
      rdf <- length(resid(object)) - object$rank  # residual degrees of freedom
      r.squared <- mss/(mss + rss)
      adj.r.squared <- 1 - (1 - r.squared)*(NROW(y) - 1)/rdf
      adj.r.squared
    }
    
    mtmatrix <- as.matrix(mtcars)
    tstlm <- lm(cyl ~ hp + wt, data = mtcars)
    tstlmf <- .lm.fit(cbind(1, mtmatrix[, c("hp", "wt")]), mtmatrix[, "cyl"])
    
    summary(tstlm)$adj.r.squared
    #[1] 0.7753073
    adj_r2_lmfit(tstlmf, mtmatrix[, "cyl"])
    #[1] 0.7753073
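
For a model with an intercept, the same quantity can also be written directly in terms of the residual and total sums of squares, which avoids reconstructing the fitted values. The sketch below is only an illustration under that assumption; the helper name adj_r2_from_rss and the variables X, y and fit are made up here and are not part of the original answer.

```r
# Hypothetical helper (illustrative name, not from the original answer).
# Assumes the design matrix passed to .lm.fit() contains an intercept column.
adj_r2_from_rss <- function(object, y) {
  n   <- length(y)
  rss <- sum(object$residuals^2)   # residual sum of squares
  tss <- sum((y - mean(y))^2)      # total sum of squares around the mean
  rdf <- n - object$rank           # residual degrees of freedom
  1 - (rss / rdf) / (tss / (n - 1))
}

# Same model as in the question: cyl ~ hp + wt on mtcars
y   <- mtcars$cyl
X   <- cbind(1, as.matrix(mtcars[, c("hp", "wt")]))
fit <- .lm.fit(X, y)

adj_r2_from_rss(fit, y)

# Check against the value reported by summary.lm()
all.equal(adj_r2_from_rss(fit, y),
          summary(lm(cyl ~ hp + wt, data = mtcars))$adj.r.squared)
```

Without an intercept column neither this sketch nor the function above matches what summary() reports, because summary() then uses uncentred sums of squares and n rather than n - 1.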