r ggplot2 logistic-regression

geom_abline for logistic regression (ggplot2)


I am sorry if this question is very simple, but I could not find a solution to my problem. I want to plot logistic regression lines with ggplot2. The problem is that I do not have the original models, just the slope and intercept for each regression line. I have used this approach for linear regressions, and it works fine with geom_abline, because you can pass multiple slopes and intercepts to the function:

geom_abline(data = estimates, aes(intercept = inter, slope = slo))

where inter and slo are vectors with more than one value.

If I try the same approach with coefficients from a logistic regression, I get the wrong regression lines (straight lines rather than logistic curves). I am trying to use geom_line instead, but I cannot use the predict function to generate the predicted values because I don't have the original model object.

Any suggestion?

Thanks in advance, Gustavo


Solution

  • If the model has a logit link, you can plot the predicted probabilities using only the intercept (coefs[1]) and slope (coefs[2]), by applying the inverse logit (plogis) to the linear predictor:

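    library(ggplot2)

    # Simulate data and fit a model only to obtain example coefficients;
    # in the asker's situation the intercept and slope are already known.
    n <- 100L
    x <- rnorm(n, 2.0, 0.5)
    y <- factor(rbinom(n, 1L, plogis(-0.6 + 1.0 * x)))

    mod <- glm(y ~ x, binomial("logit"))
    coefs <- coef(mod)  # coefs[1] = intercept, coefs[2] = slope

    # Evaluate the inverse logit of the linear predictor on a grid of x values
    x_plot <- seq(-5.0, 5.0, by = 0.1)
    y_plot <- plogis(coefs[1] + coefs[2] * x_plot)

    plot_data <- data.frame(x_plot, y_plot)

    ggplot(plot_data) + geom_line(aes(x_plot, y_plot), col = "red") + 
            xlab("x") + ylab("p(y | x)") +
            scale_y_continuous(limits = c(0, 1)) + theme_bw()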
    

    plot 1
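
    As a sanity check on the approach above, the following minimal sketch compares the curve built from the two coefficients alone with what predict would return if the model object were available. It assumes a single-predictor binomial GLM with a logit link, refit here on simulated data; the names x_new, eta, p_manual, p_plogis, p_predict and max_diff are illustrative only.

    ```r
    # Assumption: one predictor, binomial family, logit link (simulated data).
    set.seed(1)
    n <- 100L
    x <- rnorm(n, 2.0, 0.5)
    y <- rbinom(n, 1L, plogis(-0.6 + 1.0 * x))
    mod <- glm(y ~ x, family = binomial("logit"))
    coefs <- coef(mod)

    x_new <- seq(-5, 5, by = 0.1)
    eta <- coefs[1] + coefs[2] * x_new   # linear predictor on the logit scale
    p_manual <- 1 / (1 + exp(-eta))      # inverse logit written out by hand
    p_plogis <- plogis(eta)              # the same transformation via plogis

    # Reference values from the fitted model object (not available to the asker)
    p_predict <- predict(mod, newdata = data.frame(x = x_new), type = "response")

    # Largest absolute disagreement between the two routes
    max_diff <- max(abs(unname(p_plogis) - unname(p_predict)))
    ```

    If max_diff is at the level of floating-point error, the intercept and slope are enough to reproduce the fitted curve. For a different link (for example probit) the inverse link would differ, so plogis would have to be swapped accordingly.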

    Edit

    Here is one way of plotting k predicted probability lines on the same graph, following on from the previous code:

    library(reshape2)

    k <- 5L

    # Example intercepts and slopes, drawn around the fitted coefficients
    intercepts <- rnorm(k, coefs[1], 0.5)
    slopes <- rnorm(k, coefs[2], 0.5)

    # One column of predicted probabilities per (intercept, slope) pair
    x_plot <- seq(-5.0, 5.0, by = 0.1)
    model_predictions <- sapply(seq_len(k), function(idx) {
                plogis(intercepts[idx] + slopes[idx] * x_plot)
            })

    colnames(model_predictions) <- seq_len(k)
    plot_data <- as.data.frame(cbind(x_plot, model_predictions))
    # Reshape from wide to long so that ggplot2 can map colour to the model
    plot_data_melted <- melt(plot_data, id.vars = "x_plot", variable.name = "model", 
            value.name = "y_plot")

    ggplot(plot_data_melted) + geom_line(aes(x_plot, y_plot, col = model)) + 
            xlab("x") + ylab("p(y | x)") +
            scale_y_continuous(limits = c(0, 1)) + theme_bw()
    

    plot 2
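
    Closer to the layout in the question, the same idea can be sketched starting from a data frame with one row per regression. This is only an illustration: the estimates data frame below is a hypothetical stand-in that reuses the column names inter and slo from the question, the coefficient values are invented, and the model, x_grid, curves, prob and curve_plot names are assumptions. It uses base R's merge for the cross join instead of reshape2.

    ```r
    library(ggplot2)

    # Hypothetical stand-in for the asker's data: one row per regression.
    estimates <- data.frame(
      model = factor(1:3),
      inter = c(-0.6, 0.2, -1.5),
      slo   = c(1.0, 0.5, 2.0)
    )

    x_grid <- seq(-5, 5, by = 0.1)

    # The two data frames share no column names, so merge() returns every
    # combination of model and x value (a cross join).
    curves <- merge(estimates, data.frame(x = x_grid))

    # Inverse logit of each model's linear predictor
    curves$prob <- plogis(curves$inter + curves$slo * curves$x)

    curve_plot <- ggplot(curves, aes(x, prob, colour = model)) +
      geom_line() +
      labs(x = "x", y = "p(y | x)") +
      ylim(0, 1) +
      theme_bw()
    curve_plot
    ```

    Keeping the coefficients in a data frame mirrors the geom_abline(data = estimates, ...) call from the question, so the linear and logistic versions of the plot can share the same input.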