r, ggplot2, linear-regression, least-squares

Add OLS squares to ggplot2 graph


I'm trying to plot a graph with points and an OLS line, and I want to add a square to every point to explain how OLS works. I am doing this with a for loop, which is not ideal in the first place, but it is what it is. My code is as follows:

library(ggplot2)

set.seed(0)

x = sort(rnorm(20, 5, 5))
y = sort(rnorm(20, 5, 10))
lmod = lm(y ~ x)

subor <- data.frame(prva = x, druha = y)

a = ggplot(subor, aes(x = prva, y = druha)) + 
  geom_point() +
  geom_abline(slope = lmod$coefficients[2], intercept = lmod$coefficients[1]) +
  coord_fixed()



for(i in 1:20) {
  a = a + geom_rect(aes(xmin = ifelse(y[i] > lmod$fitted.values[i], x[i] - (y[i] - lmod$fitted.values[i]), x[i]), 
                        ymin = min(y[i], lmod$fitted.values[i]), ymax = max(y[i], lmod$fitted.values[i]), 
                        xmax = ifelse(y[i] > lmod$fitted.values[i], x[i], x[i] + (lmod$fitted.values[i] - y[i]))))
}

a

But instead of getting all 20 squares, I only get the last one.



Solution

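  • You don't really need a loop here. The loop only draws the last square because aes() stores its arguments unevaluated: every layer's expressions are evaluated when the plot is drawn, by which time i is 20, so all 20 rectangles land on the same point. Instead, use the vectorized pmin and pmax functions to work out the minimum and maximum edges of your squares from the fitted values and residuals: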

    a + geom_rect(alpha = 0.2, col = 'gray60',
                  aes(ymin = pmin(druha, fitted(lmod)), 
                      ymax = pmax(druha, fitted(lmod)),
                      xmin = pmin(prva, prva - resid(lmod)),
                      xmax = pmax(prva, prva - resid(lmod))))
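
    The same idea can also be sketched by computing the corners of the squares up front and handing them to geom_rect as its own data frame. This is only a minimal sketch: the names res, squares and p, and the columns xmin, xmax, ymin and ymax, are illustrative and not from the original post. It assumes the same simulated data as in the question.

    ```r
    library(ggplot2)

    set.seed(0)
    x <- sort(rnorm(20, 5, 5))
    y <- sort(rnorm(20, 5, 10))
    lmod <- lm(y ~ x)

    # Residuals: positive when the point lies above the fitted line.
    res <- resid(lmod)

    # One row per observation holding the edges of its residual square.
    # (`squares` and its column names are hypothetical, chosen for this sketch.)
    squares <- data.frame(
      xmin = pmin(x, x - res),          # square extends left if the point is above the line
      xmax = pmax(x, x - res),          # and right if it is below
      ymin = pmin(y, fitted(lmod)),
      ymax = pmax(y, fitted(lmod))
    )

    p <- ggplot(data.frame(prva = x, druha = y), aes(x = prva, y = druha)) +
      geom_rect(data = squares,
                aes(xmin = xmin, xmax = xmax, ymin = ymin, ymax = ymax),
                inherit.aes = FALSE, alpha = 0.2, colour = "gray60") +
      geom_point() +
      geom_abline(slope = coef(lmod)[2], intercept = coef(lmod)[1]) +
      coord_fixed()   # equal axis scales, so the rectangles render as true squares

    p
    ```

    Keeping the geometry in a data frame makes it easy to inspect: each row's width and height should both equal the absolute residual of that observation.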
    
