Search code examples
statalogistic-regressionmultinomial

Marginal effects from random effects multinomial logit with Stata


I have data in which participants pick one of three options for a number of related questions. For one part of the analysis, I have combined all of these answers, so that I have a number of observations from each participant. The way I have modeled this is with a multinomial logit with the participant ID as a random effect. I have then estimated the model using gllamm.

Where I've now been stuck for a while is that I cannot seem to extract marginal effects from this regression. My searches so far suggest that the way to do it involves gllapred, mu marg. Running that, however, seems to return the overall probability of picking a particular choice in my sample. Rather, I'd like to find out how a change in one of my dummy variables (e.g. male) changes the probability of making that particular choice.

Assuming there is no way to get anything like the output of margins for this type of object, is there a way I can get the marginal effects manually? That is, can I estimate at male = 0, male = 1, then take the difference? My variables of interest are dummies, but I do have a continuous variable (age) that I presumably could not estimate like this -- however, I'm also not all that interested in the marginal effect of that.


Solution

  • I am also learning about gllamm and finally got around to making this small example that hopefully helps the discussion (even though it is too late for the OP's original need).

    The example does three things:

    1. uses results from e(b) to calculate the marginal effect of a dichotomous variable (on the odds ration; some tweak needed for probabilities;
    2. demonstrates results from the gllapred varname, mu margapproach, which is not desired for our interest;
    3. compares the expanded method that estimates separate (and correlated) random effects for different responses and the "naive" method that produces only one random intercept.

    A useful reference for this type of models is section 9.3 from an earlier manual of gllamm.

    First some fake data. Hope this fits the OP's description.

    clear
    set more off
    input id q1 q2 q3 q4 age female
    1 3 1 2 3 10 1
    2 3 1 2 3 12 0 
    3 3 3 1 1 11 0
    4 1 1 2 3 9  1
    5 1 3 1 1 10 1
    6 2 3 1 2 11 1
    7 2 1 1 3 11 1
    8 2 1 3 3 11 1
    9 1 2 3 1 11 1
    10 1 2 3 1 11 1
    11 2 1 3 2 12 0
    12 3 1 1 2 12 1
    13 2 1 2 3 12 0
    14 2 1 1 1 12 0
    15 3 2 1 1 12 0
    end
    reshape long q, i(id) j(item)
    

    This is a "naive" model that only produces one random effect variance despite separate estimated for the outcome of response=2 (versus 1), and response=3 (versus 1).

    gllamm q age female,i(id) link(mlogit) family(binomial) base(1)
    

    As the OP pointed out, the gllapred varname, mu marg approach basically predicts the probabilities for every individual on each response. Despite the similar name, this is different from Stata's margins command.

    // The -gllapred- approach
    gllapred x1,outcome(1) mu marginal
    gllapred x2,outcome(2) mu marginal
    gllapred x3,outcome(3) mu marginal
    sort id item
    list id item q x1 x2 x3 // same probability for the same individual; x1+x2+x3=1
    

    But we can manually estimate the marginal effect of female when other covariates are fixed using results from e(b). First, we find out the mean of age.

    // fix age at mean
    su age, meanonly // average age
    loca mean_age=r(mean)
    

    Then we subtract the two predicted odds ratios manually for each response.

    // response: c2
    loca c2m=exp(_coef[c2:age]*`mean_age'+_coef[c2:_cons])
    loca c2f=exp(_coef[c2:age]*`mean_age'+_coef[c2:female]+_coef[c2:_cons])
    loca diffc2=`c2f'-`c2m'
    di "c2[OR_female - OR_male]=>" %5.3g `c2f' " -" %5.3g `c2m' " =" %5.3g `diffc2'  // "OR" for odds ratio
    
    // response: c3
    loca c3m=exp(_coef[c3:age]*`mean_age'+_coef[c3:_cons])
    loca c3f=exp(_coef[c3:age]*`mean_age'+_coef[c3:female]+_coef[c3:_cons])
    loca diffc3=`cf'-`c3m'
    di "c3[OR_female - OR_male]=>" %5.3g `c3f' " -" %5.3g `c3m' " =" %5.3g `diffc3'  // "OR" for odds ratio
    

    Next, we apply the expanded model that produces the same fixed effect estimates but two variances for the random effects plus their covariance.

    sort id item
    gen patt=_n
    expand 3                            // triple number of cases
    sort patt
    rename q response                   // just to match help file
    by patt, sort: gen alt=_n           // create all three potential answers
    gen chosen=response==alt            // mark the case with the chosen answer
    qui tab alt, gen(it)
    eq i2: it2
    eq i3: it3
    
    gllamm alt age female,i(id) nrf(2) eqs(i2 i3) nip(4) expanded(patt chosen m) /*
    */ link(mlogit) family(binomial) trace // compare the random effects to the "naive" model
    

    The fixed effects are the same as the last model, but note that there are now two more random effect parameters. Code in the last model can be used to calculate the marginal effect of female.

    // fix age at mean
    su age, meanonly
    loca mean_age=r(mean)
    
    // c2
    loca c2m=exp(_coef[c2:age]*`mean_age' + _coef[c2:_cons])
    loca c2f=exp(_coef[c2:age]*`mean_age' + _coef[c2:female]+_coef[c2:_cons])
    loca diffc2=`c2f'-`c2m'
    di "c2[OR_female - OR_male]=>" %5.3g `c2f' " -" %5.3g `c2m' " =" %5.3g `diffc2'  // "OR"=>odds ratio
    
    // c3
    loca c3m=exp(_coef[c3:age]*`mean_age' + _coef[c3:_cons])
    loca c3f=exp(_coef[c3:age]*`mean_age' + _coef[c3:female]+_coef[c3:_cons])
    loca diffc3=`cf'-`c3m'
    di "c3[OR_female - OR_male]=>" %5.3g `c3f' " -" %5.3g `c3m' " =" %5.3g `diffc3'  // "OR"=>odds ratio
    

    Hopefully I am getting this right. Feel free to correct if any mistake was committed.