Search code examples
rr-marginaleffects

Incorporate AME into glm() loop of logit models


I'm running a bunch of identical logistic regression models across a variety of different treatment conditions from an experiment, and I have a handful of outcomes, so I've written a simple bit of code that loops through those and creates a dataframe of results at the end.

I think my question is fairly straightforward so I'll be succinct: what is the best way to subsequently calculate average marginal effects (AME) and CIs using my same framework? Of course, if there's some completely alternative way of doing this that requires a change to my strategy, that's fine too. Ideally, at the end, my goal is to have one big dataframe with the logit coefficients and CIs alongside the AMEs and CIs.

library(tidyverse)
library(broom)

# Simulate some data
set.seed(123)  
n_obs <- 500
d <- data.frame(
  white = sample(c(0, 1), n_obs, replace = TRUE),              # binary indicator
  gender = sample(c("male", "female"), n_obs, replace = TRUE),   # categorical variable
  funding_health = rnorm(n_obs, mean = 50, sd = 10),             # continuous predictor
  treat_condition = sample(c("A", "B"), n_obs, replace = TRUE)     # treatment condition
)

logit1 <- -1 + 0.5 * d$white + 0.3 * (d$gender == "female") + 0.02 * d$funding_health
prob1 <- 1 / (1 + exp(-logit1))
d$outcome1 <- rbinom(n_obs, size = 1, prob = prob1)

logit2 <- -0.5 + 0.2 * d$white - 0.4 * (d$gender == "female") + 0.03 * d$funding_health
prob2 <- 1 / (1 + exp(-logit2))
d$outcome2 <- rbinom(n_obs, size = 1, prob = prob2)

outcomes <- list(outcome1 = "Outcome 1", outcome2 = "Outcome 2")
treat_conditions_list <- c("A", "B")

# Models
model_results <- expand.grid(outcome_var = names(outcomes), condition = treat_conditions_list) %>%
  pmap_dfr(function(outcome_var, condition) {
    mod <- glm(as.formula(paste(outcome_var, 
                                "~ factor(white) + factor(gender) + funding_health")),  
               data = d,
               family = binomial,
               subset = treat_condition == condition,
               na.action = na.exclude)
    n <- nobs(mod)  # Number of observations used in the model
    tidy(mod, conf.int = TRUE) %>%
      mutate(outcome = outcomes[[outcome_var]],
             treat = condition,
             N = n)
  })


Solution

  • Main website for marginaleffects documentation: https://marginaleffects.com/

    Specific page on (average) slopes: https://marginaleffects.com/chapters/slopes.html

    Load the package:

    library(marginaleffects)
    

    Replace this line:

        tidy(mod, conf.int = TRUE) %>%
    

    By this line:

        avg_slopes(mod) %>%
    

    And you get these results:

    
               Term      Contrast  Estimate Std. Error      z Pr(>|z|)    S    CI low  CI high
     funding_health dY/dX          0.008961    0.00275  3.254  0.00114  9.8  0.003564  0.01436
     gender         male - female -0.087811    0.05865 -1.497  0.13433  2.9 -0.202759  0.02714
     white          1 - 0          0.290830    0.05852  4.970  < 0.001 20.5  0.176138  0.40552
     funding_health dY/dX          0.003978    0.00288  1.382  0.16686  2.6 -0.001662  0.00962
     gender         male - female  0.052246    0.05848  0.893  0.37165  1.4 -0.062374  0.16687
     white          1 - 0         -0.036878    0.05900 -0.625  0.53191  0.9 -0.152509  0.07875
     funding_health dY/dX         -0.000535    0.00305 -0.175  0.86084  0.2 -0.006516  0.00545
     gender         male - female -0.195410    0.06136 -3.184  0.00145  9.4 -0.315682 -0.07514
     white          1 - 0          0.023226    0.06125  0.379  0.70454  0.5 -0.096824  0.14328
     funding_health dY/dX          0.005933    0.00283  2.096  0.03604  4.8  0.000386  0.01148
     gender         male - female  0.085773    0.05744  1.493  0.13536  2.9 -0.026805  0.19835
     white          1 - 0          0.006688    0.05757  0.116  0.90751  0.1 -0.106140  0.11952
    
    

    Note that the results are "pretty-printed", but that you can extract columns normally, as with any other data frame:

    colnames(model_results)
     [1] "term"      "contrast"  "estimate"  "std.error" "statistic" "p.value"   "s.value"   "conf.low"  "conf.high" "outcome"   "treat"     "N"