r ggplot2 regression

How to correctly extract the regression equation from `stat_smooth` or a regression model in R


I plotted the equation extracted from stat_smooth, but it does not match the original curve.

I tried to draw a regression line with stat_smooth and then draw the regression formula again with stat_function, but the two curves are obviously different. What did I do wrong?

I can't post pictures yet, so here is an example using the ggpmisc package:

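library(ggplot2)

# Plot 1: fitted quadratic and the equation label reported by ggpmisc
ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point() +
  ggpmisc::stat_poly_line(formula = y ~ poly(x, degree = 2)) +
  ggpmisc::stat_poly_eq(aes(label = after_stat(eq.label)), 
                        formula = y ~ poly(x, degree = 2))

# Plot 2: same plot, plus the labelled equation drawn by hand with stat_function
# (this hand-drawn curve is the one that does not match the fitted line)
ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point() +
  ggpmisc::stat_poly_line(formula = y ~ poly(x, degree = 2)) +
  ggpmisc::stat_poly_eq(aes(label = after_stat(eq.label)), 
                        formula = y ~ poly(x, degree = 2)) +
  stat_function(fun = \(x) 231 - 585 * x + 190 * x ^ 2)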

Solution

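  • When you use poly(x, 2), x is transformed into an orthogonal polynomial basis (poly() defaults to raw = FALSE), so the coefficients 231, -585 and 190 multiply the columns of that basis, not x and x^2.

    In stat_function, x holds the original mpg values rather than the orthogonal polynomial values, so plugging it into that equation gives a different curve.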

    Here's a workaround:

    # Compute your polynomial a priori
    polynomial <- poly(mtcars$mpg, 2)
    
    # Compute your equation using the polynomials
    mtcars$newY <- 231 - 585 * polynomial[,"1"] + 190 * polynomial[,"2"]
    
    # Plot
    ggplot(mtcars, aes(x = mpg, y = disp)) + geom_point() +
      ggpmisc::stat_poly_line(formula = y ~ poly(x, degree = 2)) +
      ggpmisc::stat_poly_eq(aes(label = after_stat(eq.label)), 
                            formula = y ~ poly(x, degree = 2)) +
      geom_line(data = mtcars, aes(x = mpg, y = newY))
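
As an alternative to the workaround above, here is a minimal sketch that fits the same quadratic on raw powers of mpg, so the coefficients can be used directly on the original x values. It uses only base R; the names fit_orth, fit_raw, b, raw_curve, same_fit and same_curve are made up for illustration and do not come from the question.

```r
# Fit the same quadratic twice: once on the orthogonal basis (the default for
# poly()) and once on raw powers of mpg (raw = TRUE).
fit_orth <- lm(disp ~ poly(mpg, 2), data = mtcars)
fit_raw  <- lm(disp ~ poly(mpg, 2, raw = TRUE), data = mtcars)

# Coefficients of the raw fit multiply 1, mpg and mpg^2 directly.
b <- unname(coef(fit_raw))
raw_curve <- function(x) b[1] + b[2] * x + b[3] * x^2

# Both parameterisations span the same model, so the fitted values should agree.
same_fit <- isTRUE(all.equal(unname(fitted(fit_orth)), unname(fitted(fit_raw))))

# The hand-written equation should reproduce predict() on new mpg values.
new_mpg <- seq(min(mtcars$mpg), max(mtcars$mpg), length.out = 50)
same_curve <- isTRUE(all.equal(
  raw_curve(new_mpg),
  unname(predict(fit_orth, newdata = data.frame(mpg = new_mpg)))
))
```

With ggplot2 loaded, adding stat_function(fun = raw_curve) to the plot should then overlay the fitted line. Writing the ggpmisc formula as y ~ poly(x, degree = 2, raw = TRUE) should likewise make the printed label use coefficients that apply to the original x, though that is worth checking against the ggpmisc documentation for your version.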