If anybody has a similar question, I found my mistake. According to the linearmodels manual, the robust covariance estimator used (White's robust covariance) is not robust in the presence of fixed effects; clustered covariance is required instead. Stata appears to handle this automatically. The correct command is therefore:
import statsmodels.api as sm2
from linearmodels import PanelOLS
fe_mod = PanelOLS(df.y, sm2.add_constant(df[['x1', 'x2']]), entity_effects=True)
fe_res = fe_mod.fit(cov_type='clustered', cluster_entity=True)
print(fe_res)
This leads to:
                          PanelOLS Estimation Summary
================================================================================
Dep. Variable:                      y   R-squared:                        0.0008
Estimator:                   PanelOLS   R-squared (Between):             -0.0212
No. Observations:               34338   R-squared (Within):               0.0008
Date:                Tue, May 05 2020   R-squared (Overall):           3.076e-05
Time:                        18:04:12   Log-likelihood                -4.647e+05
Cov. Estimator:             Clustered
                                        F-statistic:                      13.569
Entities:                        1304   P-value                           0.0000
Avg Obs:                       26.333   Distribution:                 F(2,33805)
Min Obs:                       0.0000
Max Obs:                       75.000   F-statistic (robust):             8.0007
                                        P-value                           0.0003
Time periods:                      88   Distribution:                 F(2,33805)
Avg Obs:                       390.20
Min Obs:                       0.0000
Max Obs:                       499.00

                             Parameter Estimates
==============================================================================
            Parameter   Std. Err.     T-stat    P-value    Lower CI    Upper CI
------------------------------------------------------------------------------
const       5.472e+05   2.296e+04     23.840     0.0000   5.023e+05   5.922e+05
x1            -758.82      201.83    -3.7597     0.0002     -1154.4     -363.23
x2            -322.77      219.40    -1.4712     0.1413     -752.80      107.25
==============================================================================

F-test for Poolability: 1337.3
P-value: 0.0000
Distribution: F(530,33805)
Included effects: Entity
This is close enough to the Stata version for me to trust the results.
You need to compare apples to apples, so use clustering with OLS and clustering with xtreg, fe (or robust with xtreg, fe, which will default to clustering, as Thomas pointed out). These coefficient equivalences are limited to two-period (one pre, one post) datasets with treatment at the same time for all treated units.
Here's an example of a 2x2 DID on a public dataset demonstrating this. NJ restaurants are treated (they become subject to the minimum wage increase) and PA restaurants are not. February '92 (t=0) is pre and November '92 (t=1) is post. The DID parameter is the interaction of t = 1 and NJ = 1. The outcome fte is full-time equivalent employees. Here I balance the panel in order to get xtreg, fe and OLS to give the same coefficient estimates. If the panel is unbalanced (consists of repeated cross-sections), xtreg, fe will drop observations that appear in only one year, and the estimates will no longer match OLS or manual calculations. You may want to stick with clustered OLS if you have a repeated cross-section.
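The balancing step below (Stata's bysort id: keep if _N==2) has a simple pandas analogue. A minimal sketch with a toy unbalanced panel (the column names mirror the Card-Krueger variables but the data are made up):

```python
import pandas as pd

# Toy unbalanced panel: id 3 is observed only in the pre period
df = pd.DataFrame({
    "id":  [1, 1, 2, 2, 3],
    "t":   [0, 1, 0, 1, 0],
    "fte": [20.0, 18.0, 15.0, 16.0, 12.0],
})

# Keep only units observed in both periods -- the pandas analogue of
# Stata's `bysort id: keep if _N == 2`
balanced = df.groupby("id").filter(lambda g: len(g) == 2)

print(sorted(balanced["id"].unique()))  # id 3 is dropped
```

After this step every remaining unit contributes exactly one pre and one post observation, which is what makes the fixed-effects and OLS coefficient estimates line up.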
Here is the result. Note that you can use factor variable notation to create the interactions rather than hard coding them.
. use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear
(Dataset from Card&Krueger (1994))
. drop if id == 407 // duplicate restaurant
(4 observations deleted)
. drop if missing(fte, treated, t, id)
(19 observations deleted)
. bysort id: keep if _N==2 // balance the panel
(19 observations deleted)
. xtset id t
panel variable: id (strongly balanced)
time variable: t, 0 to 1
delta: 1 unit
.
. /* calculate DID by hand */
. table treated t, c(mean fte N fte) row col
----------------------------------------
New |
Jersey = |
1; |
Pennsylva | Feb. 1992 = 0; Nov. 1992 = 1
nia = 0 | 0 1 Total
----------+-----------------------------
PA | 20.17333 17.65 18.91167
| 75 75 150
|
NJ | 17.06927 17.51831 17.29379
| 314 314 628
|
Total | 17.66774 17.5437 17.60572
| 389 389 778
----------------------------------------
. di %9.3f (17.51831 - 17.06927) - (17.65 - 20.17333)
2.972
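The hand calculation is just a difference of two differences. The same arithmetic in Python, using the cell means from the table above:

```python
# Cell means of fte from the table (NJ = treated, PA = control)
nj_pre, nj_post = 17.06927, 17.51831
pa_pre, pa_post = 20.17333, 17.65

# DID = (change in treated group) - (change in control group)
did = (nj_post - nj_pre) - (pa_post - pa_pre)
print(round(did, 3))  # 2.972
```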
.
. /* fit models */
. eststo ols_robust: qui reg fte i.treated##i.t, robust
. eststo xtreg_robust: qui xtreg fte i.treated##i.t, fe robust
. eststo xtreg_clust: qui xtreg fte i.treated##i.t, fe cluster(id)
. eststo ols_clust: qui reg fte i.treated##i.t, cluster(id)
.
. capture ssc install estout
. esttab *, se(%9.7f) noomitted drop(0.treated 0.t 0.treated#0.t) modelwidt(15) mtitles label varwidth(35)
---------------------------------------------------------------------------------------------------------------
(1) (2) (3) (4)
ols_robust xtreg_robust xtreg_clust ols_clust
---------------------------------------------------------------------------------------------------------------
NJ -3.104* -3.104*
(1.4475664) (1.4484988)
Feb. 1992 = 0; Nov. 1992 = 1=1 -2.523 -2.523* -2.523* -2.523*
(1.6371048) (1.2498119) (1.2498119) (1.2506190)
NJ # Feb. 1992 = 0; Nov. 1992 = 1=1 2.972 2.972* 2.972* 2.972*
(1.7822146) (1.3337493) (1.3337493) (1.3346107)
Constant 20.17*** 17.67*** 17.67*** 20.17***
(1.3591695) (0.2232501) (0.2232501) (1.3600450)
---------------------------------------------------------------------------------------------------------------
Observations 778 778 778 778
---------------------------------------------------------------------------------------------------------------
Standard errors in parentheses
* p<0.05, ** p<0.01, *** p<0.001
Clustering in DID settings is a good idea for reasons outlined in Bertrand, Duflo, and Mullainathan's 2004 QJE paper. Clustering at the level of treatment is also a good idea, but it is not feasible here since there are not enough clusters (treatment is a state law and we have data from only two states) for that to work well. Generally your SEs will go up when you cluster in DID, but if the errors are negatively correlated within cluster, they might shrink. See this post for the reasons why.
Code:
estimates clear
cls
use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear
drop if id == 407 // duplicate restaurant
drop if missing(fte, treated, t, id)
bysort id: keep if _N==2 // balance the panel
xtset id t
/* calculate DID by hand */
table treated t, c(mean fte N fte) row col
di %9.3f (17.51831 - 17.06927) - (17.65 - 20.17333)
/* fit models */
eststo ols_robust: qui reg fte i.treated##i.t, robust
eststo xtreg_robust: qui xtreg fte i.treated##i.t, fe robust
eststo xtreg_clust: qui xtreg fte i.treated##i.t, fe cluster(id)
eststo ols_clust: qui reg fte i.treated##i.t, cluster(id)
capture ssc install estout
esttab *, se(%9.7f) noomitted drop(0.treated 0.t 0.treated#0.t) modelwidt(15) mtitles label varwidth(35)
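The same robust-vs-clustered comparison can be sketched in Python with statsmodels. This uses simulated two-period data rather than the Card-Krueger dataset (to keep it self-contained; all variable names are illustrative), and shows the key fact from the esttab table: the covariance choice changes the standard errors but not the coefficients.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated two-period panel with a unit-level error component,
# so errors are correlated within unit (cluster)
rng = np.random.default_rng(0)
n_units = 200
ids = np.repeat(np.arange(n_units), 2)
t = np.tile([0, 1], n_units)
treated = np.repeat(rng.integers(0, 2, n_units), 2)
u = np.repeat(rng.normal(0, 2, n_units), 2)        # within-cluster component
y = 1.5 * treated * t + u + rng.normal(0, 1, 2 * n_units)
df = pd.DataFrame({"y": y, "t": t, "treated": treated, "id": ids})

# Identical point estimates, different covariance estimators
ols_robust = smf.ols("y ~ treated * t", df).fit(cov_type="HC1")
ols_clust = smf.ols("y ~ treated * t", df).fit(
    cov_type="cluster", cov_kwds={"groups": df["id"]})

b = "treated:t"                                     # the DID interaction term
print(ols_robust.params[b], ols_clust.params[b])    # same coefficient
print(ols_robust.bse[b], ols_clust.bse[b])          # different SEs
```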
Best Answer
Robust variance estimators require large samples to be valid. In small samples they are biased downward, and normal-distribution-based confidence intervals may have coverage well below nominal rates.
Using a $t_{n-k}$-distribution approximation to be conservative is one possible solution: you hope that this fattens up the tails adequately before you offer up your tests to the journal referee gods. Other ideas are multiplying the squared residuals by $\frac{n}{n-k}$ (or something similar) to inflate them (which Stata also does), higher-order asymptotic expansions, or resampling methods like bootstrapping. Here $n$ is the number of observations and $k$ the number of parameters.
A nice survey of this literature is Imbens and Kolesar (2012). They give a great example where the $t$ approximation goes wrong in a setting with a binary treatment and very few treated observations: using $n=n_T+n_C$ degrees of freedom there is far too generous.
If your sample size is large, using the $t$ versus the normal won't matter at all.
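A quick scipy check of that last point: the $t$ critical value is much wider than the normal's at small degrees of freedom, and converges to it as the sample grows.

```python
from scipy import stats

z = stats.norm.ppf(0.975)                 # normal critical value, ~1.96
for df in (5, 30, 1000):
    t_crit = stats.t.ppf(0.975, df)
    print(df, round(t_crit, 3), round(t_crit / z, 3))
# At df = 5 the t critical value is ~2.571, about 31% wider than the
# normal's ~1.96; by df = 1000 the two are essentially identical.
```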