r, regression, sample

Drawing regression lines for 100 samples of 20 from a dataset, along with the population regression line


I have a dataset with two variables, hours studied and grade. I would like to take 100 samples of 20 observations each from this dataset and plot the 100 sample regression lines along with the original regression line. Any suggestions?

library(ggplot2)
#> Warning: package 'ggplot2' was built under R version 3.6.3
grades = read.csv("https://www.dropbox.com/s/me6wiww943hzddj/grades.csv?dl=1")
qplot(hours, grade, data = grades, geom = "point") + geom_smooth(method = lm)
#> `geom_smooth()` using formula 'y ~ x'


Solution

  • Using a loop:

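    # start from the question's plot: scatterplot plus the full-data regression line
    g = qplot(hours, grade, data = grades, geom = "point") + geom_smooth(method = lm)
    n = 100
    for(i in 1:n){
      # draw 20 rows at random and add that sample's regression line
      df = grades[sample(1:nrow(grades), 20),]
      g = g + geom_smooth(method = lm, data=df, color="red", size=0.5, alpha = 0)
    }
    plot(g)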
    

    Output: the original scatterplot with 100 thin red regression lines, one per sample, drawn over it.

    I encourage you to experiment with the aesthetics, for example by using a dashed line.
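As a variation on the loop, here is a minimal sketch that stacks all the samples into one data frame and lets a single `geom_smooth()` layer draw one dashed line per sample through the `group` aesthetic. The simulated `grades` data, the `sample_id` column, and the names `n_samples` and `sample_size` are assumptions made up for illustration so that the example runs without the original CSV. It is one possible approach, not the only one.

```r
library(ggplot2)

set.seed(1)  # make the random samples reproducible

# Simulated stand-in for the real data (assumption: columns `hours` and `grade`)
grades <- data.frame(hours = runif(200, 0, 10))
grades$grade <- 50 + 4 * grades$hours + rnorm(200, sd = 8)

n_samples   <- 100
sample_size <- 20

# Draw each sample without replacement and tag its rows with a sample id
samples <- do.call(rbind, lapply(seq_len(n_samples), function(i) {
  df <- grades[sample(nrow(grades), sample_size), ]
  df$sample_id <- i
  df
}))

g <- ggplot(grades, aes(hours, grade)) +
  geom_point() +
  # one dashed red line per sample; se = FALSE drops the confidence band
  geom_smooth(data = samples, aes(group = sample_id), method = "lm",
              formula = y ~ x, se = FALSE, color = "red", linetype = "dashed") +
  # line fitted to the full data set, added last so it is drawn on top
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE, color = "blue")

print(g)
```

With the real data, replace the simulated `grades` with the data frame read from the CSV. Using `se = FALSE` has the same visual effect as `alpha = 0` in the loop version, since both hide the confidence band.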