Tags: r, regression, data-modeling, linear-regression, modeling

Input distance from data point to regression (using residuals?) into a new dataframe column


I want to calculate the distance from each individual data point on my plot to the linear regression line on that plot, and then store those distances as a new variable (column) in my original dataframe. Based on this answer, the distance can be found using the residuals of the linear regression. However, I do not know how to get that value for each individual point, or how to store the values in the dataframe (if that is even possible).

I created some example data:

ex.age <- c(50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70)
ex.score <- c(10,9,9,10,8,7,9,6,8,7,6,8,6,5,6,4,5,6,3,5,3)
ex.df <- data.frame(ex.age,ex.score)

When graphed, it looks like this: [graph]

I want to then be able to calculate the distance from each point to the regression line, and then store that in a new column, ex.df$reg.dev.

How would I be able to accomplish that?

Thank you.


Solution

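  • All you need to do is grab the residuals from the lm() fit. Each residual is the signed vertical distance from a point to the regression line (observed value minus fitted value). As long as no rows are dropped for missing values, the residuals come back in the same row order as the data, so they can be assigned directly to a new column.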

    ex.age <- c(50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70)
    ex.score <- c(10,9,9,10,8,7,9,6,8,7,6,8,6,5,6,4,5,6,3,5,3)
    ex.df <- data.frame(ex.age,ex.score)
    
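    # Fit the linear regression of score on age
    ex_model <- lm(ex.score ~ ex.age, data = ex.df)
    # Store each point's residual (observed minus fitted) as a new column
    ex.df$reg.dev <- residuals(ex_model)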
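
The residuals above are signed vertical distances, so points below the line get negative values. If an unsigned distance, or the perpendicular (shortest) distance to the line, is what you are after, the following is a minimal sketch of one way to get them. The column names reg.dev.abs and reg.dev.perp are made up for illustration, and the perpendicular distance assumes you are happy to treat one unit of age and one unit of score as comparable, which may not hold for your data.

```r
# Example data from the question
ex.age <- c(50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70)
ex.score <- c(10,9,9,10,8,7,9,6,8,7,6,8,6,5,6,4,5,6,3,5,3)
ex.df <- data.frame(ex.age, ex.score)

ex_model <- lm(ex.score ~ ex.age, data = ex.df)

# Signed vertical distance: observed score minus fitted score
ex.df$reg.dev <- residuals(ex_model)

# Sanity check: the same values computed by hand from the fitted values
manual.dev <- ex.df$ex.score - fitted(ex_model)
stopifnot(isTRUE(all.equal(unname(manual.dev), unname(ex.df$reg.dev))))

# Unsigned vertical distance (hypothetical column name)
ex.df$reg.dev.abs <- abs(ex.df$reg.dev)

# Perpendicular distance to the line y = a + b * x:
# |y - a - b * x| / sqrt(1 + b^2)  (hypothetical column name)
slope <- unname(coef(ex_model)["ex.age"])
ex.df$reg.dev.perp <- ex.df$reg.dev.abs / sqrt(1 + slope^2)

head(ex.df)
```

For most regression diagnostics the plain residuals in reg.dev are the quantity you want; the perpendicular version is mainly of interest when the geometric distance on the plot itself matters.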