Solved – Single difference vs double difference model

I would like to understand how one can get the same coefficient estimate from two different model specifications.

Consider the single-difference estimation model:

$y_{\{Time=1\}}=\alpha+\beta_3 \textbf{Treatment}+\beta_4 y_{\{Time=0\}}+\epsilon, $

where $\textbf{Time} \in \{0,1\}$ indicates before/after and $\textbf{Treatment} \in \{0,1\}$ indicates the control/treatment group.

Now consider the double-difference estimation model:

$y=\alpha+\beta_1 \textbf{Treatment}+\beta_2 \textbf{Time}+ \beta_3 (\textbf{Treatment}*\textbf{Time}) +\epsilon. $

The source, which I am questioning, claims that one can estimate the $\beta_3$ coefficient using either of the models above. However, when I rearrange terms and write each model out by group and time period, I find the following.

The well-known double-difference estimator from the DID model is (suppressing expected values):

$\beta_3=\Delta y_{\{Time=1\}}-\Delta y_{\{Time=0\}}$,

where $\Delta$ denotes the difference between the treatment and control groups.

When I use the single difference model, I get the following for $\beta_3$:

$\beta_3=\Delta y_{\{Time=1\}}-\beta_4 \Delta y_{\{Time=0\}}$,

which shows that unless we impose the constraint $\beta_4=1$, I cannot estimate the treatment effect using the single-difference estimator.
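
Spelled out, the rearrangement behind this expression takes group means of the single-difference model. This is a sketch that assumes $E[\epsilon \mid \textbf{Treatment}] = 0$ (OLS residuals satisfy the sample analogue by construction) and writes $\bar y^{T}_{t}$ and $\bar y^{C}_{t}$ for the treatment- and control-group means at time $t$, notation introduced here for illustration only:

$$\begin{aligned}
\bar y^{T}_{1} &= \alpha + \beta_3 + \beta_4 \bar y^{T}_{0}, \\
\bar y^{C}_{1} &= \alpha + \beta_4 \bar y^{C}_{0}, \\
\bar y^{T}_{1} - \bar y^{C}_{1} &= \beta_3 + \beta_4 (\bar y^{T}_{0} - \bar y^{C}_{0}) \\
\Rightarrow \quad \beta_3 &= \Delta y_{\{Time=1\}} - \beta_4 \Delta y_{\{Time=0\}}.
\end{aligned}$$

Setting $\beta_4 = 1$ in the last line reproduces the DID expression for $\beta_3$.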

Question

Am I calculating this incorrectly or missing something? Could someone confirm whether both models can yield the same estimate of $\beta_3$?

Best Answer

You are correct that the ANCOVA estimator and the DID estimator do not estimate the same parameter. ANCOVA estimates $$(\bar Y^T_{POST} - \bar Y^C_{POST}) - \hat \theta \cdot (\bar Y^T_{PRE} - \bar Y^C_{PRE}),$$ where $\hat \theta$ is the coefficient on the lagged outcome, while DID is $$(\bar Y^T_{POST} - \bar Y^T_{PRE}) - (\bar Y^C_{POST} - \bar Y^C_{PRE}).$$

These formulas are given in McKenzie (2012).

You can verify this yourself with a regression:

. use http://fmwww.bc.edu/repec/bocode/c/CardKrueger1994.dta, clear
(Dataset from Card&Krueger (1994))

. /* fix sample */
. drop if id == 407 // duplicate restaurant
(4 observations deleted)

. xtset id t
       panel variable:  id (strongly balanced)
        time variable:  t, 0 to 1
                delta:  1 unit

. drop if missing(fte)
(19 observations deleted)

. bysort id: keep if _N==2
(19 observations deleted)

. reg fte i.treated##i.t, cluster(id) // DID

Linear regression                               Number of obs     =        778
                                                F(3, 388)         =       1.88
                                                Prob > F          =     0.1318
                                                R-squared         =     0.0091
                                                Root MSE          =     9.0696

                                   (Std. Err. adjusted for 389 clusters in id)
------------------------------------------------------------------------------
             |               Robust
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     treated |
         NJ  |  -3.104066   1.448499    -2.14   0.033    -5.951955   -.2561769
         1.t |  -2.523333   1.250619    -2.02   0.044    -4.982171   -.0644953
             |
   treated#t |
       NJ#1  |   2.972378   1.334611     2.23   0.027     .3484041    5.596352
             |
       _cons |   20.17333   1.360045    14.83   0.000     17.49935    22.84731
------------------------------------------------------------------------------

. reg fte i.treated L.fte if t==1, cluster(id) // ANCOVA

Linear regression                               Number of obs     =        389
                                                F(2, 388)         =      50.02
                                                Prob > F          =     0.0000
                                                R-squared         =     0.2817
                                                Root MSE          =     7.3454

                                   (Std. Err. adjusted for 389 clusters in id)
------------------------------------------------------------------------------
             |               Robust
         fte |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     treated |
         NJ  |   1.374712   .9578786     1.44   0.152    -.5085701    3.257994
             |
         fte |
         L1. |    .485299   .0485207    10.00   0.000     .3899025    .5806954
             |
       _cons |   7.859902   1.224966     6.42   0.000       5.4515     10.2683
------------------------------------------------------------------------------

. table t treated , c(mean fte) // means

------------------------------
Feb. 1992 |  New Jersey = 1;  
= 0; Nov. |  Pennsylvania = 0 
1992 = 1  |       PA        NJ
----------+-------------------
        0 | 20.17333  17.06927
        1 |    17.65  17.51831
------------------------------

. di (17.518 - 17.069 ) - ( 17.650-20.173 )
2.972

. di (17.518 - 17.650) - .485299*(17.069 -20.173 )
1.3743681
