Tags: r, ggplot2, glm, predict

How can I ggplot a logistic function correctly using predict or inv.logit?


I have some observations that I used to estimate death rates at various concentrations of a chemical. I weighted these rates by the number of observations underlying them and fit a glm (binomial(link=logit)) model. I have been trying, without luck, to plot this model in ggplot with the original observations (size = weight), the fitted line, and a confidence interval. I can get a simple plot() to work, but then I can't display the other graphics I need. Any ideas? Thanks in advance!

#data:
C <- data.frame("region" = c("r29","r31","r2325","r25","r2526","r26"),
                "conc" = c(755.3189,1689.6680,1781.8450,1902.8830,2052.1133,4248.7832),
                "nr_dead" = c(1,1,18,44,170,27),
                "nr_survived" = c(2,3,29,1370,1910,107),
                "death_rate" = c(0.33333333,0.25000000,0.38297872,0.03111740,0.08173077,0.20149254))
C$tot_obsv <- (C$nr_survived+C$nr_dead)
#glm model:
C_glm <- glm(cbind(nr_dead, nr_survived) ~ conc, data = C, family = "binomial")
#ggplot line is incorrect:
ggplot(C_glm, aes(C$conc,C$death_rate, size = C$tot_obsv)) + coord_cartesian(ylim = c(0, 0.5)) + theme_bw() + geom_point() + geom_smooth(method = "glm", mapping = aes(weight = C$tot_obsv))

#correct plot of inv.logit = logistic function (1/(1+exp(-x)))
#(inv.logit() is from the boot package; base R's plogis() is equivalent)
plot(inv.logit(-3.797+0.0005751*(0:6700)))

#using predict function works, but doesn't display confidence interval or nice point sizes:
x_conc <- seq(750, 6700, 1)
y_death_rate <- predict.glm(C_glm, list(conc=x_conc), type="response")
plot(C$conc, C$death_rate, pch = 10, lwd = 3, cex = C$tot_obsv/300, ylim = c(0, 0.5), xlim = c(0,7000), xlab = "conc", ylab = "death rate")
lines(x_conc, y_death_rate, col = "red", lwd = 2)

Basically, I am trying to plot the glm predicted logistic curve, observation weights, and confidence interval using ggplot, but can only get the curve to display correctly using plot().


Solution

  • Building on @IceCreamToucan's answer:

    library(dplyr)    # tibble(), left_join(), %>%
    library(ggplot2)

    tibble(
      x_conc = c(seq(750, 6700, 1), C$conc),
      y_death_rate = predict.glm(C_glm, list(conc = x_conc), type = "response")
      ) %>%
      left_join(C, by = c('x_conc' = 'conc')) %>%
      ggplot(aes(x = x_conc, y = y_death_rate)) +
        # geom_line() is commented out because binomial_smooth() draws the line;
        # note that the smooth is fitted to the predicted values (x_conc, y_death_rate)
        #geom_line(aes(size = 0.8)) +
        geom_point(aes(y = death_rate, size = tot_obsv)) + binomial_smooth()
    

    Of course, the function binomial_smooth has to be defined before running the plot code. It is taken from https://ggplot2.tidyverse.org/reference/geom_smooth.html:

    binomial_smooth <- function(...) {
        geom_smooth(method = "glm", method.args = list(family = "binomial"), ...)
    }
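
    In the plot code above, the smooth inherits `y = y_death_rate`, so it is fitted to the predicted curve rather than to the observations. The helper can also be applied directly to the observed data. This is a minimal sketch, assuming the data frame `C` from the question. With the observed proportion as the response and `weight = tot_obsv`, the fit should match the `cbind(nr_dead, nr_survived)` model, so the shaded band reflects the actual counts. The name `p_weighted` is only illustrative.

    ```r
    library(ggplot2)

    # Data from the question (death_rate recomputed from the counts)
    C <- data.frame(
      conc        = c(755.3189, 1689.6680, 1781.8450, 1902.8830, 2052.1133, 4248.7832),
      nr_dead     = c(1, 1, 18, 44, 170, 27),
      nr_survived = c(2, 3, 29, 1370, 1910, 107)
    )
    C$tot_obsv   <- C$nr_survived + C$nr_dead
    C$death_rate <- C$nr_dead / C$tot_obsv

    # Same helper as in the answer
    binomial_smooth <- function(...) {
      geom_smooth(method = "glm", method.args = list(family = "binomial"), ...)
    }

    # data = C (a data frame, not the glm object) and bare column names in aes()
    p_weighted <- ggplot(C, aes(x = conc, y = death_rate)) +
      geom_point(aes(size = tot_obsv)) +
      # weight = number of observations behind each proportion
      binomial_smooth(mapping = aes(weight = tot_obsv), formula = y ~ x) +
      coord_cartesian(ylim = c(0, 0.5)) +
      theme_bw()

    print(p_weighted)
    ```

    Compared with the `ggplot()` call in the question, the differences are the `family = "binomial"` argument (without it, `method = "glm"` fits a Gaussian model, i.e. a straight line) and passing `C` instead of `C_glm` as the data.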
    

    plot with smoothing based on glm and conf. interval shading
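
    If the band should come from `C_glm` itself (the route the question started down with `predict`), one option is to compute it on the link scale and back-transform. This is a sketch under those assumptions, reusing `C` and `C_glm` from the question. The names `newdat`, `pred` and `p_manual` and the 1.96 multiplier (an approximate 95% Wald interval) are choices made for this illustration, and `plogis()` is base R's inverse logit.

    ```r
    library(ggplot2)

    # Data and model from the question
    C <- data.frame(
      conc        = c(755.3189, 1689.6680, 1781.8450, 1902.8830, 2052.1133, 4248.7832),
      nr_dead     = c(1, 1, 18, 44, 170, 27),
      nr_survived = c(2, 3, 29, 1370, 1910, 107)
    )
    C$tot_obsv   <- C$nr_survived + C$nr_dead
    C$death_rate <- C$nr_dead / C$tot_obsv
    C_glm <- glm(cbind(nr_dead, nr_survived) ~ conc, data = C, family = "binomial")

    # Predict on the link (logit) scale, with standard errors
    newdat <- data.frame(conc = seq(750, 6700, by = 1))
    pred   <- predict(C_glm, newdata = newdat, type = "link", se.fit = TRUE)

    # Back-transform the fit and the interval limits to probabilities
    newdat$fit   <- as.numeric(plogis(pred$fit))
    newdat$lower <- as.numeric(plogis(pred$fit - 1.96 * pred$se.fit))
    newdat$upper <- as.numeric(plogis(pred$fit + 1.96 * pred$se.fit))

    p_manual <- ggplot(newdat, aes(x = conc)) +
      geom_ribbon(aes(ymin = lower, ymax = upper), alpha = 0.2) +
      geom_line(aes(y = fit), colour = "red") +
      geom_point(data = C, aes(y = death_rate, size = tot_obsv)) +
      labs(x = "conc", y = "death rate") +
      theme_bw()

    print(p_manual)
    ```

    Building the interval on the link scale keeps the limits inside [0, 1] after the back-transform, which adding and subtracting standard errors on the response scale would not guarantee.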