Tags: r, loops, plot, summary, linear-regression

Create function to automatically create plots from summary(fit <- lm( y ~ x1 + x2 +... xn))


I am running the same regression several times with small changes to the x variables. Once I have determined the fit and the significance of each variable for this linear regression model, I want to view all the major plots. Instead of creating each plot one by one, I want a function that loops through my variables (x1...xn) from the following model.

fit <- lm(y ~ x1 + x2 + ... + xn)

The plots I want to create for all x are: 1) x versus y, 2) x versus predicted y, 3) x versus residuals, and 4) x versus time, where time is not a variable used in the regression but is provided in the data frame the data comes from.

I know how to access the coefficients from fit; however, I am not able to take the coefficient names from the summary and reuse them in a function for creating the plots, because the names are character strings.

I hope my question has been clearly described and hasn't been asked already.

Thanks!


Solution

  • Create some mock data

    dat <- data.frame(x1=rnorm(100), x2=rnorm(100,4,5), x3=rnorm(100,8,27), 
      x4=rnorm(100,-6,0.1), t=(1:100)+runif(100,-2,2))
    dat <- transform(dat, y=x1+4*x2+3.6*x3+4.7*x4+rnorm(100,3,50))
    

    Make the fit

    fit <- lm(y~x1+x2+x3+x4, data=dat)
    

    Compute the predicted values

    dat$yhat <- predict(fit)
    

    Compute the residuals

    dat$resid <- residuals(fit)
    

    Get a vector of the variable names

    vars <- names(coef(fit))[-1]
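
Taking the names from the coefficients works here because every predictor is a plain numeric column. As a hedged aside, if the model contained a factor, the coefficient names would no longer match the column names; in that case the names can be read from the formula instead. The data frame `d`, the factor `g`, and `fit2` below are made up for illustration and are not part of the question.

```r
# Hypothetical data with one numeric and one factor predictor
set.seed(1)
d <- data.frame(x1 = rnorm(20), g = factor(rep(c("a", "b"), 10)))
d$y <- d$x1 + rnorm(20)
fit2 <- lm(y ~ x1 + g, data = d)

# Coefficient names: the factor shows up as "gb", which is not a column of d
coef_names <- names(coef(fit2))[-1]

# Variable names from the formula: the first entry is the response, so drop it
var_names <- all.vars(formula(fit2))[-1]

print(coef_names)  # "x1" "gb"
print(var_names)   # "x1" "g"
```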
    

    A plot can be made from this character representation of the name if you use it to build a string version of a formula and convert that with as.formula. The four plots are below, and they are wrapped in a loop over all the vars. Additionally, the loop is surrounded by setting ask to TRUE so that you get a chance to see each plot. Alternatively, you can arrange multiple plots on the screen, or write them all to files to review later.

    opar <- par(ask=TRUE)
    for (v in vars) {
      plot(as.formula(paste("y~",v)), data=dat)
      plot(as.formula(paste("yhat~",v)), data=dat)
      plot(as.formula(paste("resid~",v)), data=dat)
      plot(as.formula(paste("t~",v)), data=dat)
    }
    par(opar)
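
Since the question asks for a function, the steps above could be wrapped up roughly as follows. This is only a sketch under some assumptions: the helper name `plot_lm_vars` and its arguments `time_var` and `file` are invented here, `data` is assumed to be the data frame the model was fitted on with no rows dropped for missing values, and the response is assumed to be an untransformed column. It also illustrates the two alternatives mentioned above: the four plots for each variable share one 2x2 page, and the pages are written to a PDF when `file` is given.

```r
# Hypothetical helper (name and arguments are illustrative, not from the answer).
# Draws the four plots for every predictor of an lm fit, one 2x2 page each.
plot_lm_vars <- function(fit, data, time_var = "t", file = NULL) {
  all_names <- all.vars(formula(fit))
  yname <- all_names[1]           # response variable name
  vars  <- all_names[-1]          # predictor variable names

  # Assumes data has the same rows, in the same order, as used in the fit
  data$yhat  <- predict(fit)
  data$resid <- residuals(fit)

  if (is.null(file)) {
    # On screen: pause between pages when running interactively
    opar <- par(mfrow = c(2, 2), ask = interactive())
    on.exit(par(opar))
  } else {
    # To file: open a PDF device and close it when the function exits
    pdf(file)
    on.exit(dev.off())
    par(mfrow = c(2, 2))
  }

  for (v in vars) {
    plot(as.formula(paste(yname, "~", v)), data = data)
    plot(as.formula(paste("yhat ~", v)), data = data)
    plot(as.formula(paste("resid ~", v)), data = data)
    plot(as.formula(paste(time_var, "~", v)), data = data)
  }
  invisible(vars)
}

# Usage with a smaller version of the mock data
set.seed(42)
dat <- data.frame(x1 = rnorm(100), x2 = rnorm(100, 4, 5), t = 1:100)
dat$y <- dat$x1 + 4 * dat$x2 + rnorm(100)
fit <- lm(y ~ x1 + x2, data = dat)

out <- tempfile(fileext = ".pdf")
plotted <- plot_lm_vars(fit, dat, time_var = "t", file = out)
```

Calling `plot_lm_vars(fit, dat)` without `file` draws to the current graphics device instead. Re-running the function after changing the set of x variables in the model picks up the new variable names automatically, because they are read from the fit itself.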