Tags: r, ggplot2, lm, p-value

ggplot2: Add the p-value, Rsq and slope for multiple columns


Let's say I have this data frame:

library(ggplot2)
Y <- rnorm(100)
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = Y)
colNames <- names(df)[1:3]
for(i in colNames){
  plt <- ggplot(df, aes_string(x = i, y = "Y")) +
    geom_point(color="#B20000", size=4, alpha=0.5) +
    geom_hline(yintercept=0, size=0.06, color="black") + 
    geom_smooth(method=lm, alpha=0.25, color="black", fill="black")
  print(plt)
  Sys.sleep(2)
}

I want to fit an lm model for each column and display its adjusted R-squared, intercept, slope and p-value. I found an example below:

data(iris)
ggplotRegression <- function (fit) {

  require(ggplot2)

  ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) + 
    geom_point() +
    stat_smooth(method = "lm", col = "red") +
    labs(title = paste("Adj R2 = ",signif(summary(fit)$adj.r.squared, 5),
                       "Intercept =",signif(fit$coef[[1]],5 ),
                       " Slope =",signif(fit$coef[[2]], 5),
                       " P =",signif(summary(fit)$coef[2,4], 5)))
}

fit1 <- lm(Sepal.Length ~ Petal.Width, data = iris)
ggplotRegression(fit1)

But it only works for one column. (I took the examples from this question and this one over here.)

Thanks!


Solution

  • Building on the comment above, you can put the fit inside the function and then loop over the columns with lapply.

    library(ggplot2)
    
    Y <- rnorm(100)
    df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                     Y = Y)
    colNames <- names(df)[1:3]
    
    
    plot_ls <- lapply(colNames, function(x){
      # Fit Y against the current column; the model frame names it "df[[x]]"
      fit <- lm(Y ~ df[[x]], data = df)
      ggplot(fit$model, aes_string(x = names(fit$model)[2], y = names(fit$model)[1])) +
        geom_point() +
        scale_x_continuous(x) +  # label the x axis with the real column name
        stat_smooth(method = "lm", col = "red") +
        # Title: adjusted R2, intercept, slope and the slope's p-value
        ggtitle(paste("Adj R2 =", signif(summary(fit)$adj.r.squared, 5),
                      "Intercept =", signif(fit$coef[[1]], 5),
                      " Slope =", signif(fit$coef[[2]], 5),
                      " P =", signif(summary(fit)$coef[2, 4], 5)))
    })
    
    gridExtra::grid.arrange(plot_ls[[1]],plot_ls[[2]],plot_ls[[3]])
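
If you also want the numbers themselves, not only the plot titles, the same per-column fit can be collected into a small table. This is only a sketch in base R: the helper name lm_stats and the seed are my own assumptions, not part of the code above, and it builds the formula with reformulate() instead of df[[x]].

```r
# Hypothetical helper (not from the answer above): fit y ~ x for one column
# and return the four statistics that appear in the plot titles.
lm_stats <- function(data, x, y = "Y") {
  # reformulate() builds the formula from strings, e.g. Y ~ A
  fit <- lm(reformulate(x, response = y), data = data)
  s <- summary(fit)
  c(adj_r2    = s$adj.r.squared,
    intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    p_value   = s$coefficients[2, 4])  # p-value of the slope's t-test
}

set.seed(1)  # arbitrary seed, only so the example is reproducible
df <- data.frame(A = rnorm(100), B = runif(100), C = rlnorm(100),
                 Y = rnorm(100))
colNames <- names(df)[1:3]

# One row per predictor column, one column per statistic
stats_tbl <- t(sapply(colNames, lm_stats, data = df))
print(signif(stats_tbl, 5))
```

Each row of stats_tbl could then be pasted into ggtitle() for the matching plot, which keeps the fitting step separate from the plotting step.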
    
