Plotting a multiple logistic regression for binary and continuous values in R

I have a data frame of mammal genera. Each row of the column is a different genus. There are three columns: a column of each genus's geographic range size (a continuous variable), a column stating whether or not a genus is found inside or outside of river basins (a binary variable), and a column stating whether the genus is found in the fossil record (a binary variable).

I have performed a multiple logistic regression to see if geographic range size and presence in/out of basins is a predictor of presence in the fossil record using the following R code.

Regression<-glm(df[ ,"FossilRecord"] ~ log(df[ ,"Geographic Range"]) + df[  ,"Basin"], family="binomial")

I am trying to find a way to visually summarize the output of this regression (other than a table of the regression summary).

I know how to do this for a single variable regression. For example, I could use a plot like if I wanted to see the relationship between just geographic range size and presence in the fossil record.

However, I do not know how to make a similar or equivalent plot when there are two independent variables, and one of them is binary. What are some plotting and data visualization techniques I could use in this case?

Thanks for the help!

Solution

Visualization is important and yet it can be very hard. With your example, I would recommend plotting one line for predicted FossilRecord versus GeographicRange for each level of your categorical covariate (Basin). Here's an example of how to do it with the ggplot2 package

##generating data
ssize <- 100
set.seed(12345)
dat <- data.frame(
  Basin =  rbinom(ssize, 1,.4),
  GeographicRange = rnorm(ssize,10,2)
)
dat$FossilRecord = rbinom(ssize,1,(.3 + .1*dat$Basin + 0.04*dat$GeographicRange))

##fitting model
fit <- glm(FossilRecord ~ Basin + GeographicRange, family=binomial(), data=dat)

We can use the predict() function to obtain predicted response values for many GeographicRange values and for each Basin category.

##getting predicted response from model
plotting_dfm <- expand.grid(GeographicRange = seq(from=0, to = 20, by=0.1),
                           Basin = (0:1))
plotting_dfm$preds <- plogis( predict(fit , newdata=plotting_dfm))

Now you can plot the predicted results:

##plotting the predicted response on the two covariates
library(ggplot2)
pl <- ggplot(plotting_dfm, aes(x=GeographicRange, y =preds, color=as.factor(Basin)))
pl + 
  geom_point( ) +
  ggtitle("Predicted FossilRecord by GeoRange and Basin") + 
  ggplot2::ylab("Predicted FossilRecord")

This will produce a figure like this: