r, plot, linear-regression, p-value

How to plot the results of many regressions in a loop?


I have a for loop in my code that runs a regression on each variable of the mtcars dataset and gives me the R-squared and p-value for each one. How can I plot or visualize these results to compare the variables and see which one is the most significant? Here are the results:

     var   rsquared           pvalue
d2   cyl   0.726180005093805  6.11268714258098e-10
d3   disp  0.71834334048973   9.3803265373815e-10
d4   hp    0.602437341423934  1.78783525412108e-07
d5   drat  0.463995167985087  1.77623992875241e-05
d6   wt    0.752832793658264  1.29395870135053e-10
d7   qsec  0.175296320261013  0.0170819884965196
d8   vs    0.440947686116142  3.41593725441993e-05
d9   am    0.359798943425465  0.000285020743935068
d10  gear  0.23067344813203   0.00540094822470767
d11  carb  0.30351843705443   0.00108444622049167
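
For anyone who wants to reproduce a table like this, here is a minimal sketch of such a loop. The original loop is not shown, so this is an assumption: it takes `mpg` as the response (the R-squared values above appear consistent with that) and fits one simple linear regression per remaining column. The names `predictors` and `results` are made up for illustration.

```r
# Assumption: mpg is the response and every other mtcars column is used,
# one at a time, as the single predictor.
predictors <- setdiff(names(mtcars), "mpg")

results <- do.call(rbind, lapply(predictors, function(v) {
  # Build the formula mpg ~ <v> and fit the model
  fit <- lm(reformulate(v, response = "mpg"), data = mtcars)
  s <- summary(fit)
  data.frame(
    var      = v,
    rsquared = s$r.squared,
    pvalue   = s$coefficients[2, "Pr(>|t|)"]  # p-value of the slope term
  )
}))

print(results)
```

With a single predictor, the p-value of the slope is the same as the p-value of the overall F-test, so either could be stored in the `pvalue` column.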


Solution

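  • Use geom_text_repel from ggrepel (on top of ggplot2) to label each point, and use color to distinguish significance. Here `data` is the results data frame from the question, with columns var, rsquared and pvalue.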

    library(ggplot2) 
    library(ggrepel)
    
    ggplot(data, aes(x = pvalue, y = rsquared, label = var, color = pvalue<0.05)) + 
      geom_point(size = 1.5) +
      geom_text_repel(show.legend = FALSE) + 
      scale_color_manual(values = c("TRUE" = "blue", "FALSE" = "green"),
                         labels = c("TRUE" = "Significant", "FALSE" = "Insignificant")) +
      labs(color = 'Significance (p-value < 0.05)') +
      theme_classic()
    

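
Since the p-values in the question span roughly eight orders of magnitude, most points in a plot with raw p-values on one axis end up crowded near zero. As an alternative, here is a rough sketch (not part of the original answer) that ranks the variables by -log10(p-value) in a bar chart and uses fill for R-squared. The data frame is rebuilt from rounded values in the question so the snippet runs on its own; the dashed line marks the conventional 0.05 cutoff.

```r
library(ggplot2)

# Values copied (rounded) from the table in the question
data <- data.frame(
  var      = c("cyl", "disp", "hp", "drat", "wt",
               "qsec", "vs", "am", "gear", "carb"),
  rsquared = c(0.7262, 0.7183, 0.6024, 0.4640, 0.7528,
               0.1753, 0.4409, 0.3598, 0.2307, 0.3035),
  pvalue   = c(6.11e-10, 9.38e-10, 1.79e-07, 1.78e-05, 1.29e-10,
               1.71e-02, 3.42e-05, 2.85e-04, 5.40e-03, 1.08e-03)
)

# Bars ordered by -log10(p-value): a longer bar means a smaller p-value
p <- ggplot(data, aes(x = reorder(var, -log10(pvalue)),
                      y = -log10(pvalue),
                      fill = rsquared)) +
  geom_col() +
  geom_hline(yintercept = -log10(0.05), linetype = "dashed") +  # 0.05 cutoff
  coord_flip() +
  labs(x = "Predictor", y = "-log10(p-value)", fill = "R-squared") +
  theme_classic()

print(p)
```

In this view all ten bars cross the dashed line, which makes it obvious that every variable passes the 0.05 threshold and that the interesting comparison is the ranking, with wt, cyl and disp on top.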