Search code examples
rggplot2

Using ggplot and alleffects in tandem


I am trying to get a graph which shows a continous variable on the X-axis. And a probability on the Y-axis. I can do this easily for my data if I take my glm model, and then use plot(allEffects(glm)). This gives me: enter image description here However, this plot looks rather bad (I have to format it as though it was a paper).
I wish to use ggplot in order to prettify this plot a bit, but I don't know how to do this.
I know I can extract some of the effect value with allEffects(glm), but I am unsure how to then use this properly.
I tried using these extracted effects in the ggplot function, but as is to be expected, this does not work as the data is not formatted properly.
My question: How do I use the allEffects function in tandem with ggplot. Or alternatively, how do I show a probability on the Y-axis using ggplot


Solution

  • You don't need to use allEffects at all. You can plot a glm directly in ggplot.

    There was no sample data in the question, so let's make some up:

    set.seed(69)
    
    x  <- sample(1:100, 100, prob = (100:1), replace = TRUE)
    y  <- rbinom(200, 1, (100 - x)/100)
    
    df <- data.frame(x, y)
    
    model1 <- glm(y ~ x, data = df, family = binomial)
    

    Now model1 is a logistic regression model, and of the many ways we could choose to plot this, we could choose to plot it with allEffects, which is certainly easy enough:

    plot(allEffects(model1))
    

    However, to gain more control of the look of the plot, we can display an equivalent plot in ggplot using geom_smooth. Here, I have chosen some black & white publication - type settings, but clearly you will want to change these based on your tastes and the journal's requirements:

    ggplot(df, aes(x, y)) + 
      geom_smooth(method = "glm", formula = y ~ x, colour = "black",
                  linetype = 2, fill = "gray80", alpha = 0.2,
                  method.args = list(family = binomial)) +
      geom_rug(sides = "b") +
      theme_bw() +
      labs(y = "Probability", x = "Continuous variable", 
           title = "Logistic regression model") +
      theme(text = element_text(size = 16),
            plot.margin = margin(50, 50, 50, 50),
            axis.title.x = element_text(vjust = -8),
            axis.title.y = element_text(vjust = 10),
            plot.title = element_text(vjust = 8))
    

    You will notice that the line isn't straight - that's because we have a fixed probability axis, which I think is a better way of presenting such a model.

    Created on 2020-07-15 by the reprex package (v0.3.0)