Solved – Wrong confidence intervals using `margins` in Stata

stata

I am using Stata 13.1 to fit a logistic model and I am getting confidence intervals below 0 and above 1 when I predict probabilities using the margins command.

MRE:

sysuse auto, clear

* simple logistic regression
logit foreign mpg

* get predicted probabilities
margins, at(mpg=(5(5)40)) predict(pr)

* same result with expression
margins, at(mpg=(5(5)40)) exp(invlogit(predict(xb)))

The output is:

------------------------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         _at |
          1  |   .0271183   .0252542     1.07   0.283    -.0223792    .0766157
          2  |   .0583461   .0389888     1.50   0.135    -.0180704    .1347627
          3  |   .1210596   .0509373     2.38   0.017     .0212244    .2208948
          4  |   .2344013   .0547344     4.28   0.000      .127124    .3416787
          5  |   .4049667   .0743318     5.45   0.000      .259279    .5506543
          6  |   .6020462   .1162814     5.18   0.000     .3741387    .8299536
          7  |   .7707955   .1266899     6.08   0.000     .5224879    1.019103
          8  |   .8820117   .1004224     8.78   0.000     .6851874    1.078836

I'm trying to figure out what is Stata doing so I extracted the results with:

* save table as matrix
mat pr = r(table)
mat p = pr' //transpose matrix

Then I predicted the log odds (linear prediction), saved the results and transformed to probability (inverse logit)

* predict log odds
margins, at(mpg=(5(5)40)) exp(predict(xb))

* save table as matrix
mat tab = r(table)
mat t = tab' 

clear
svmat t

* transform logit to probability and display
gen prob   = invlogit(t1)
gen problo = invlogit(t5)
gen probhi = invlogit(t6)

format %9.3f prob*
list prob* in 1/8, noobs clean // correct confidence intervals?!

 prob   problo   probhi  
0.027    0.004    0.154  
0.058    0.015    0.199  
0.121    0.051    0.260  
0.234    0.144    0.358  
0.405    0.271    0.555  
0.602    0.369    0.797  
0.771    0.452    0.932  
0.882    0.530    0.980  

Now if I take the results from the margins output in probability scale predict(pr) and (wrongly) use the SE from that output to produce CIs I get the same as Stata:

clear

* convert results to data
svmat p

* generate confidence intervals (the wrong way)
gen problo = p1 - (1.96*p2)
gen probhi = p1 + (1.96*p2)

format %9.3f p*
list p5 problo p6 probhi in 1/8, noobs clean

    p5   problo      p6   probhi  
-0.022   -0.022   0.077    0.077  
-0.018   -0.018   0.135    0.135  
 0.021    0.021   0.221    0.221  
 0.127    0.127   0.342    0.342  
 0.259    0.259   0.551    0.551  
 0.374    0.374   0.830    0.830  
 0.522    0.522   1.019    1.019  
 0.685    0.685   1.079    1.079  

Sorry it took me so long but here is the question:
Is there a way of getting the right CIs using the margins command directly? and does this problem happens with other GLMs?

Thanks

Best Answer

margins computes standard errors from nonlinear predictions using the delta-method, and as donlelek points out, it also uses a normal approximation for computing confidence intervals.

Consider a slight variation on donlelek's example.

sysuse auto, clear
logistic foreign mpg
margins, predict(pr) nopvalues

The result from margins is

Predictive margins                              Number of obs     =         74
Model VCE    : OIM

Expression   : Pr(foreign), predict()

--------------------------------------------------------------
             |            Delta-method
             |     Margin   Std. Err.     [95% Conf. Interval]
-------------+------------------------------------------------
       _cons |   .2972973   .0487662      .2017172    .3928773
--------------------------------------------------------------

Let's call this marginal prediction $m$.

Here we verify by hand the confidence limits produced by margins.

. display .2972973 - invnormal(.975)*.0487662
.2017173

. display .2972973 + invnormal(.975)*.0487662
.3928773

$m$ is the mean of the predicted probabilities. The predicted probability for observation $i$ is

$$ p_i = \frac{1}{1 + \exp(-z_i)} $$

where $z_i$ is the corresponding linear prediction

$$ z_i = b_0 + b_1*x_i $$

$x_i$ is the corresponding value of mpg, and $b$ is the vector of regression coefficients. Thus $m$ is

$$ m = \frac{1}{N}\sum_{i=1}^N p_i $$

$m$ is not a simple transformation of the marginal linear prediction. This is the general situation that margins was developed to handle. As such, the only method for computing a confidence interval for $m$ is the normal approximation using standard errors computed via the delta-method.

margins does not provide an option that will transform the confidence limits of a marginal linear prediction.

However, donlelek's question inspired me to write transform_margins, which will transform the point estimates and confidence limits from margins results with a user specified expression. Here is donlelek's example using transform_margins:

sysuse auto
logistic foreign mpg
margins, at(mpg=(5(5)40)) predict(xb)
transform_margins invlogit(@)

Here is the output from transform_margins

----------------------------------------------
             |         b         ll         ul
-------------+--------------------------------
         _at |
          1  |  .0271183   .0042517    .153952
          2  |  .0583461   .0151856   .1993467
          3  |  .1210596   .0511398   .2603462
          4  |  .2344013    .144129   .3575909
          5  |  .4049667   .2710298   .5547246
          6  |  .6020462   .3688264   .7966118
          7  |  .7707955   .4519781     .93203
          8  |  .8820117   .5300384   .9802168
----------------------------------------------

Type the following commands in Stata to install transform_margins.

net from http://www.stata.com/users/jpitblado
net describe transform_margins
net install transform_margins
help transform_margins
Related Question