r, plot, logistic-regression, interaction, sjplot

Interaction plots for continuous variables in logistic regression


I fitted a logistic regression model with an interaction between two continuous predictors.

I used the plot_model() function from the sjPlot package in R to draw the interaction plot, and I couldn't figure out how the function has split Volume into two groups.

library(ISLR)
library(sjPlot)

m1 = glm(Direction ~ Lag1 + Lag4 * Volume, data = Smarket, family = "binomial")

plot_model(m1, type = "int", colors = rainbow(3))


0.36 and 3.15 correspond to the minimum and maximum of Volume, respectively.

Can anyone help me interpret this plot?

Is there also any other way to draw interaction plots for logistic regression?

Thank you


Solution

  • The interaction is between two continuous variables, so Volume has not been turned into a factor. The plot uses Lag4 as the x-axis variable and picks a few values of Volume to show how the relationship between Direction and Lag4 changes with Volume. By default (mdrt.values="minmax"), the minimum and maximum of Volume are chosen, which is why the legend shows 0.36 and 3.15. You can instead show the quartiles of Volume, or the mean and the mean plus or minus one standard deviation, by using the mdrt.values argument (see the help for additional options). For example:

    library(ggplot2)            # theme_set() and ggplot() come from ggplot2
    theme_set(theme_classic())  # Set ggplot theme
    
    plot_model(m1, type="int", colors=rainbow(3), mdrt.values="quart")
    plot_model(m1, type="int", colors=rainbow(3), mdrt.values="meansd")
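
    To see which Volume values end up in the legend, the summaries behind each option can be computed by hand. This is only a sketch of what I understand the options to mean; the variable names are mine, and sjPlot may round or compute the quartiles slightly differently:

    ```r
    library(ISLR)  # provides the Smarket data used in the question

    v <- Smarket$Volume

    # Assumed meaning of each mdrt.values option (not taken from sjPlot's source)
    minmax <- range(v)                                  # "minmax": minimum and maximum
    quart  <- quantile(v, probs = c(0.25, 0.50, 0.75))  # "quart": lower quartile, median, upper quartile
    meansd <- mean(v) + c(-1, 0, 1) * sd(v)             # "meansd": mean - SD, mean, mean + SD

    round(minmax, 2)
    round(quart, 2)
    round(meansd, 2)
    ```

    The first result should match the 0.36 and 3.15 shown in the legend of the default plot.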
    


    Another option is a heatmap, which would allow you to plot the interaction variables on the x and y axes, and use colour to signify the probability of Direction equal to "Up". For example:

    # Create grid of Lag4 and Volume values for prediction, holding Lag1 at its median
    pred.dat = expand.grid(Lag1 = median(Smarket$Lag1),
                           Lag4 = seq(min(Smarket$Lag4), max(Smarket$Lag4), length.out=100),
                           Volume = seq(min(Smarket$Volume), max(Smarket$Volume), length.out=100))
    
    # Add predictions
    pred.dat$Direction = predict(m1, newdata=pred.dat, type="response")
    
    # Plot heatmap
    ggplot(pred.dat, aes(Lag4, Volume, fill=Direction)) + 
      geom_tile() +
      scale_fill_gradient2(low="red", mid="white", high="blue", 
                           midpoint=median(pred.dat$Direction)) +
      labs(title='Probability of Direction="Up"',
           fill="Probability")
    

    Each line in the interaction plots above is a horizontal slice through the heatmap at a constant value of Volume. For example, when Volume is 1.12 (red line in the left-hand plot above), the heatmap colour along that slice goes from blue to white to red, signifying a decreasing probability of Direction="Up" as Lag4 increases, just as we see in the line plot.
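
    The same slices can also be drawn by hand, which is another way to get an interaction plot without plot_model(). The following is a minimal sketch, assuming the model from the question; vol_values, slices and lag4_slope are names I made up, and choosing the mean plus or minus one SD of Volume is just one option:

    ```r
    library(ISLR)     # Smarket data
    library(ggplot2)

    m1 <- glm(Direction ~ Lag1 + Lag4 * Volume, data = Smarket, family = "binomial")

    # Volume values to slice at: mean - SD, mean, mean + SD (my choice)
    vol_values <- mean(Smarket$Volume) + c(-1, 0, 1) * sd(Smarket$Volume)

    # One row per (Lag4, Volume) pair, with Lag1 held at its median
    slices <- expand.grid(
      Lag1   = median(Smarket$Lag1),
      Lag4   = seq(min(Smarket$Lag4), max(Smarket$Lag4), length.out = 100),
      Volume = vol_values
    )
    slices$prob <- predict(m1, newdata = slices, type = "response")

    # Slope of Lag4 on the log-odds scale at each chosen Volume value:
    # coefficient of Lag4 plus the interaction coefficient times Volume
    b <- coef(m1)
    lag4_slope <- unname(b["Lag4"] + b["Lag4:Volume"] * vol_values)
    lag4_slope

    # One line per Volume value, like the plot_model() output
    p <- ggplot(slices, aes(Lag4, prob, colour = factor(round(Volume, 2)))) +
      geom_line() +
      labs(y = 'Probability of Direction = "Up"', colour = "Volume")
    print(p)
    ```

    The sign of each entry in lag4_slope tells you whether the corresponding line rises or falls with Lag4, and the spread between the entries shows how strongly Volume changes that relationship.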
