I am trying to get a graph which shows a continous variable on the X-axis. And a probability on the Y-axis. I can do this easily for my data if I take my glm model, and then use plot(allEffects(glm))
.
This gives me:
However, this plot looks rather bad (I have to format it as though it was a paper).
I wish to use ggplot in order to prettify this plot a bit, but I don't know how to do this.
I know I can extract some of the effect value with allEffects(glm)
, but I am unsure how to then use this properly.
I tried using these extracted effects in the ggplot function, but as is to be expected, this does not work as the data is not formatted properly.
My question: How do I use the allEffects function in tandem with ggplot. Or alternatively, how do I show a probability on the Y-axis using ggplot
You don't need to use allEffects
at all. You can plot a glm
directly in ggplot
.
There was no sample data in the question, so let's make some up:
set.seed(69)
x <- sample(1:100, 100, prob = (100:1), replace = TRUE)
y <- rbinom(200, 1, (100 - x)/100)
df <- data.frame(x, y)
model1 <- glm(y ~ x, data = df, family = binomial)
Now model1
is a logistic regression model, and of the many ways we could choose to plot this, we could choose to plot it with allEffects
, which is certainly easy enough:
plot(allEffects(model1))
However, to gain more control of the look of the plot, we can display an equivalent plot in ggplot using geom_smooth
. Here, I have chosen some black & white publication - type settings, but clearly you will want to change these based on your tastes and the journal's requirements:
ggplot(df, aes(x, y)) +
geom_smooth(method = "glm", formula = y ~ x, colour = "black",
linetype = 2, fill = "gray80", alpha = 0.2,
method.args = list(family = binomial)) +
geom_rug(sides = "b") +
theme_bw() +
labs(y = "Probability", x = "Continuous variable",
title = "Logistic regression model") +
theme(text = element_text(size = 16),
plot.margin = margin(50, 50, 50, 50),
axis.title.x = element_text(vjust = -8),
axis.title.y = element_text(vjust = 10),
plot.title = element_text(vjust = 8))
You will notice that the line isn't straight - that's because we have a fixed probability axis, which I think is a better way of presenting such a model.
Created on 2020-07-15 by the reprex package (v0.3.0)