How can I adjust a coefplot for the constant value of categorical variable estimation?


I have a dataset in Stata that looks something like this:

    Variable |        Obs        Mean    Std. dev.       Min        Max
-------------+---------------------------------------------------------
         dv2 |      1,904    .5395645     .427109  -1.034977   1.071396
        xvar |      1,904    3.074055    1.387308          1          5

Here xvar is a categorical independent variable and dv2 is the dependent variable of interest.

I am estimating a simple model with the categorical variable entered as a set of dummies, using category D (level 4) as the base:

    reg dv2 ib4.xvar
    eststo myest

      Source |       SS           df       MS      Number of obs   =     1,904
-------------+----------------------------------   F(4, 1899)      =     13.51
       Model |  9.60846364         4  2.40211591   Prob > F        =    0.0000
    Residual |  337.540713     1,899  .177746558   R-squared       =    0.0277
-------------+----------------------------------   Adj R-squared   =    0.0256
       Total |  347.149177     1,903  .182422058   Root MSE        =     .4216

------------------------------------------------------------------------------
         dv2 | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        xvar |
          A  |    .015635   .0307356     0.51   0.611     -.044644     .075914
          B  |   .1435987    .029325     4.90   0.000     .0860861    .2011113
          C  |   .1711176   .0299331     5.72   0.000     .1124124    .2298228
          E  |   .1337754   .0295877     4.52   0.000     .0757477    .1918032
             |
       _cons |    .447794    .020191    22.18   0.000     .4081952    .4873928
------------------------------------------------------------------------------

These are the results. As you can see, B, C and E have significantly higher values of dv2 than D, which is the excluded category; A is also higher, but not significantly so.

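However, coefplot does not account for the fact that, with a categorical variable, each category's level is composite. The level for A, for example, is the constant (the mean of the excluded category D) plus the coefficient on A: true_A = _cons + A.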

    coefplot myest, scheme(s1color) vert

The resulting plot shows the constant as the largest coefficient and the category coefficients as much smaller, because they are differences from D rather than category levels.

Is there a systematic way to adjust for this and plot the estimate and standard error of each category's level?

Thanks a lot for your help


Solution

  • In response to your second comment, here is an example of how you can use margins and marginsplot to plot the estimated category means from a linear regression.

    sysuse auto, clear
    replace price = price/100
    reg price i.rep78, cformat(%9.2f)
    
    ------------------------------------------------------------------------------
           price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           rep78 |
              2  |      14.03      23.56     0.60   0.554       -33.04       61.10
              3  |      18.65      21.76     0.86   0.395       -24.83       62.13
              4  |      15.07      22.21     0.68   0.500       -29.31       59.45
              5  |      13.48      22.91     0.59   0.558       -32.28       59.25
                 |
           _cons |      45.65      21.07     2.17   0.034         3.55       87.74
    ------------------------------------------------------------------------------
    
    margins i.rep78, cformat(%9.2f)
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
           rep78 |
              1  |      45.65      21.07     2.17   0.034         3.55       87.74
              2  |      59.68      10.54     5.66   0.000        38.63       80.73
              3  |      64.29       5.44    11.82   0.000        53.42       75.16
              4  |      60.72       7.02     8.64   0.000        46.68       74.75
              5  |      59.13       8.99     6.58   0.000        41.18       77.08
    ------------------------------------------------------------------------------
    
    marginsplot
    

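    Note that each margin is the constant plus the corresponding coefficient (for example, 45.65 + 14.03 = 59.68 for rep78 = 2), and the margin for the base category is the constant itself.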
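
    As a quick check on a single category, lincom can compute the constant plus one coefficient together with its standard error. This is a minimal sketch on the same auto data; it should reproduce the rep78 = 2 row of the margins table above.

    ```stata
    sysuse auto, clear
    replace price = price/100
    regress price i.rep78

    * Level for rep78 = 2: the constant (base category 1) plus the rep78 = 2 coefficient
    lincom _cons + 2.rep78
    ```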

    The marginsplot command then plots these marginal estimates with their confidence intervals, one point per category of rep78.
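
    If you would rather keep using coefplot, one possible approach is to post the margins as estimation results and plot those. The sketch below assumes coefplot is installed (it is a community-contributed command) and uses the stored-result names means and means2, which are just illustrative. The second variant refits the model without a constant and with no base level, which estimates the category means directly.

    ```stata
    * Assumes coefplot is installed: ssc install coefplot
    sysuse auto, clear
    replace price = price/100

    * Variant 1: post the margins so they replace e(b) and e(V), then plot them
    regress price i.rep78
    margins rep78, post
    estimates store means
    coefplot means, vertical

    * Variant 2: no base level (ibn.) and no constant, so each coefficient
    * is the mean of its own category
    regress price ibn.rep78, noconstant
    estimates store means2
    coefplot means2, vertical
    ```

    For the model in the question, the same idea would be reg dv2 ib4.xvar followed by margins xvar, post and then coefplot on the stored result.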