Search code examples
rggplot2regressionlmsummary

coefficient extraction from second order polynomial in ggplot2


What I have been researching and have not found a solution is to extract the coefficients from the second order polynomial that is being produced. I don't necessarily need all the coefficients from y = A + Bx Cx^2 (mainly just need C), but I am unsure how to access the information from the variable plot_5. I have read some stuff on summary, str, lm, and posts where user say that this task has not been implemented into ggplot2 because ggplot2 is for visualization. What I would like to do is create a new variable in Infil_Data2 ("coef") and repeat the coefficient value from the specific Site ID.

For example, the dataframe below would add a new variable named "coef", with the first 11 rows as 0.00854, then the next 11 rows as 0.0154, and finally with the remaining 11 rows as 0.00839. The row numbers are not all that important because the data will be grouped accordingly.

    library(purrr)
    library(tidyverse)
    library(ggpmisc)

    plot_5 <-
      Infil_Data2 %>% 
      split(.$Site_ID) %>% 
      map2(names(.),
           ~ggplot(.x, aes(Sqrt_Time.x, Cal_Vol_cm)) + 
             geom_point() +
             labs(title = paste(.y)) +
             theme(plot.title = element_text(hjust = 0.5)) + 
             stat_smooth(mapping = aes(x = Sqrt_Time.x, y = Cal_Vol_cm?),
                         method = "lm", se = FALSE, 
                         formula = y ~ poly(x, 2, raw = TRUE),
                         #check raw = TRUE, some say raw = FALSE gives a better fit
                         color = "red") +
             theme(plot.margin = unit(c(1, 5, 1, 1), "cm")) +
             stat_poly_eq(aes(label = paste(..eq.label.., ..rr.label..,         sep = "~~~")),
                          label.x.npc = "left", label.y.npc = 0.90, #set the position of the eq
                          formula = y ~ poly(x, 2, raw = TRUE), parse = TRUE, rr.digits = 3))

    pdf("allplots5.pdf", onefile = TRUE)
    walk(plot_5, print)
    dev.off()

    Infil_Data2 <-
        structure(list(Time = c(0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L, 0L, 30L, 60L, 90L, 120L, 150L, 180L, 
        210L, 240L, 270L, 300L), Site_ID = c("H1", "H1", "H1", "H1", 
        "H1", "H1", "H1", "H1", "H1", "H1", "H1", "H2", "H2", "H2", "H2", 
        "H2", "H2", "H2", "H2", "H2", "H2", "H2", "H3", "H3", "H3", "H3", 
        "H3", "H3", "H3", "H3", "H3", "H3", "H3"), Vol_mL = c(63, 62, 
        60, 59, 58, 56, 54, 52.5, 50, 48.5, 46.5, 82, 77, 73, 68, 65, 
        51, 56, 52, 47.5, 42.5, 37.5, 69, 67, 65, 63, 61, 60, 58, 56, 
        54, 51.5, 49), Sqrt_Time.x = c(0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808, 0, 5.477225575, 7.745966692, 9.486832981, 
        10.95445115, 12.24744871, 13.41640786, 14.49137675, 15.49193338, 
        16.43167673, 17.32050808), Cal_Vol_cm = c(0, 0.124339799, 0.373019398, 
        0.497359197, 0.621698996, 0.870378595, 1.119058194, 1.305567893, 
        1.616417391, 1.80292709, 2.051606688, 0, 0.621698996, 1.119058194, 
        1.74075719, 2.113776588, 3.854533778, 3.232834782, 3.730193979, 
        4.289723076, 4.911422072, 5.533121068, 0, 0.248679599, 0.497359197, 
        0.746038796, 0.994718394, 1.119058194, 1.367737792, 1.616417391, 
        1.865096989, 2.175946488, 2.486795986)), row.names = c(NA, 33L
        ), class = "data.frame")

Solution

  • I don't have your data, but here's an example of using broom to extract a coefficient from lm:

    library(tidyverse)
    library(broom)
    
    lm(mpg ~ wt + I(wt^2), data = mtcars) %>%
      tidy() %>%
      filter(term == "I(wt^2)") %>%
      pull(estimate)
    # [1] 1.171087
    

    EDIT, adding application to provided data:

    Or, with your Infil_Data2:

    lm(Cal_Vol_cm ~ poly(Sqrt_Time.x, 2, raw = TRUE), data = Infil_Data2) %>%
      tidy() %>%
      slice(3) %>%  # In this case, coefficient "C" is in third row
      pull(estimate)
    # [1] 0.01078006