stata, logistic-regression, multinomial

Stata multinomial regression - post-estimation Wald test


I've conducted a multinomial logistic regression analysis in Stata, followed by a Wald test, and was hoping someone could confirm that my code is doing what I think it's doing.

NB: I'm using some of Stata's example data to illustrate. The analysis I'm running for this illustration is completely meaningless, but uses the same procedure as my 'real' analysis, other than the fact that my real analysis also includes some probability weights and other covariates.

sysuse auto.dta

First, I run a multinomial logistic regression, predicting 'Repair Record' from 'Foreign' and 'Price':

mlogit rep78 i.foreign price, base(1) rrr nolog

Multinomial logistic regression                 Number of obs     =         69
                                                LR chi2(8)        =      31.15
                                                Prob > chi2       =     0.0001
Log likelihood = -78.116372                     Pseudo R2         =     0.1662

------------------------------------------------------------------------------
       rep78 |        RRR   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
1            |  (base outcome)
-------------+----------------------------------------------------------------
2            |
     foreign |
    Foreign  |   .7822853   1672.371    -0.00   1.000            0           .
       price |   1.000414   .0007027     0.59   0.556     .9990375    1.001792
       _cons |   .5000195   1.669979    -0.21   0.836      .000718    348.2204
-------------+----------------------------------------------------------------
3            |
     foreign |
    Foreign  |     686842   1.30e+09     0.01   0.994            0           .
       price |   1.000462   .0006955     0.66   0.507     .9990996    1.001826
       _cons |   1.254303   4.106511     0.07   0.945     .0020494    767.6863
-------------+----------------------------------------------------------------
4            |
     foreign |
    Foreign  |    6177800   1.17e+10     0.01   0.993            0           .
       price |   1.000421   .0006999     0.60   0.547     .9990504    1.001794
       _cons |   .5379627     1.7848    -0.19   0.852     .0008067    358.7452
-------------+----------------------------------------------------------------
5            |
     foreign |
    Foreign  |   2.79e+07   5.29e+10     0.01   0.993            0           .
       price |   1.000386   .0007125     0.54   0.587     .9989911    1.001784
       _cons |    .146745   .5072292    -0.56   0.579     .0001676    128.4611
------------------------------------------------------------------------------

Second, I want to know whether the 'Foreign' coefficient for outcome category 4 is significantly different to the 'Foreign' coefficient for outcome category 5. So, I run a Wald test:

test [4]1.foreign = [5]1.foreign

 ( 1)  [4]1.foreign - [5]1.foreign = 0

           chi2(  1) =    2.72
         Prob > chi2 =    0.0988

From this, I conclude that the 'Foreign' coefficient for outcome category 4 is NOT significantly different to the 'Foreign' coefficient for outcome category 5 (p = 0.0988). Put more simply, the association between 'Foreign' and 'Repair 4' (compared to 'Repair 1') is equal to the association between 'Foreign' and 'Repair 5' (compared to 'Repair 1').
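
One way to sanity-check what test is computing here is to form the same contrast with lincom and square its z statistic; for a single linear restriction the two should agree. This is only a minimal sketch on the example data. The scalar names wald_chi2 and z_sq are made up for illustration, and the rrr option is dropped because it only changes how results are displayed.

```stata
sysuse auto.dta, clear
quietly mlogit rep78 i.foreign price, base(1) nolog

* Wald test as in the question; test leaves its statistic in r(chi2)
test [4]1.foreign = [5]1.foreign
scalar wald_chi2 = r(chi2)

* Same contrast via lincom: difference of the two coefficients and its SE
lincom [4]1.foreign - [5]1.foreign
scalar z_sq = (r(estimate)/r(se))^2

* The two numbers should match up to rounding
display "test chi2 = " wald_chi2 "   lincom z^2 = " z_sq
```

Both commands work on the coefficient (log relative-risk) scale, so the hypothesis being tested is equality of the two 'Foreign' coefficients relative to the same base outcome.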

Is my code for the Wald test, and my inferences about what it's doing and showing, correct?


Solution

  • In addition to what was discussed in the comments, you can also perform a likelihood-ratio test using the following code.

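    sysuse auto.dta, clear
    
    * Fit the unrestricted model and store it
    qui mlogit rep78 i.foreign price, base(1) rrr nolog
    estimates store unrestricted
    
    * Impose equality of the Foreign coefficients for outcomes 4 and 5
    constraint 1 [4]1.foreign = [5]1.foreign
    
    * Refit under the constraint and store it
    qui mlogit rep78 i.foreign price, base(1) rrr nolog constraints(1)
    estimates store restricted
    
    * Likelihood-ratio test of the restricted against the unrestricted model
    lrtest unrestricted restricted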
    

    The test leads to the same conclusion as the Wald test (the null of equal coefficients is not rejected at the 5% level), but the likelihood-ratio test has better properties in this setting, as explained below.

    Likelihood-ratio test                                 LR chi2(1)  =      3.13
    (Assumption: restricted nested in unrestricted)       Prob > chi2 =    0.0771
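
    The statistic reported by lrtest is twice the difference between the unrestricted and restricted log likelihoods, referred to a chi-squared distribution with one degree of freedom (one constraint). As a rough sketch of that arithmetic, assuming the same two models as above (the scalar names ll_u, ll_r, lr_manual and p_manual are hypothetical):

    ```stata
    sysuse auto.dta, clear

    * Unrestricted fit: keep its log likelihood
    quietly mlogit rep78 i.foreign price, base(1)
    scalar ll_u = e(ll)

    * Restricted fit under the same equality constraint
    constraint 1 [4]1.foreign = [5]1.foreign
    quietly mlogit rep78 i.foreign price, base(1) constraints(1)
    scalar ll_r = e(ll)

    * LR statistic and its p-value on 1 degree of freedom
    scalar lr_manual = 2*(ll_u - ll_r)
    scalar p_manual  = chi2tail(1, lr_manual)
    display "LR chi2(1) = " %6.2f lr_manual "   Prob > chi2 = " %6.4f p_manual
    ```

    This should reproduce the lrtest figures shown above. One caveat for the real analysis: as far as I know, lrtest is not appropriate once probability weights are used, because the fit is then a pseudo-likelihood with a robust variance estimate, and Stata will refuse to run it. In that case the Wald test is likely the one to rely on.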
    

    Quoting the official documentation for mlogit:

    The results produced by test are an approximation based on the estimated covariance matrix of the coefficients. Because the probability of being uninsured is low, the log-likelihood may be nonlinear for the uninsured. Conventional statistical wisdom is not to trust the asymptotic answer under these circumstances but to perform a likelihood-ratio test instead.
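
    The quoted passage refers to the manual's own insurance example, but a similar concern seems to apply to this illustration: the very large standard errors on the Foreign rows of the mlogit table are a typical symptom of sparse or empty cells. A quick way to check, sketched with built-in commands only:

    ```stata
    sysuse auto.dta, clear

    * Cross-tabulate the outcome against the predictor to spot empty cells
    tabulate rep78 foreign

    * Count foreign cars in the base outcome category (rep78 == 1)
    count if foreign == 1 & rep78 == 1
    display "Foreign cars with rep78 == 1: " r(N)
    ```

    In auto.dta there are no foreign cars with a repair record of 1 or 2, so the Foreign coefficients relative to base outcome 1 are not well determined. That is exactly the kind of situation in which the Wald approximation is hard to trust.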