Econometrics – Difference-in-Differences with Individual Level Panel Data

difference-in-difference, econometrics, fixed-effects-model, panel data

What is the correct way to specify a difference in difference model with individual level panel data?

Here is the setup: Assume that I have individual-level panel data embedded in cities for multiple years and the treatment varies on the city-year level. Formally, let $y_{ist}$ be the outcome for individual $i$ in city $s$ and year $t$ and $D_{st}$ be a dummy for whether the intervention affected city $s$ in year $t$. A typical DiD estimator such as the one outlined in Bertrand et al (2004, p. 250) is based on a simple OLS model with fixed effect terms for city and year:

$$ y_{ist} = A_{s} + B_t + cX_{ist} + \beta D_{st} + \epsilon_{ist} $$

But does that estimator ignore the individual-level panel structure (i.e. multiple observations for each individual within cities)? Does it make sense to extend this model with an individual-level fixed effect term $S_i$? Many DiD applications use repeated cross-section data without the individual-level panel data.

Bertrand, Marianne, Esther Duflo, and Sendhil Mullainathan. 2004. “How Much Should We Trust Differences-in-Differences Estimates?” Quarterly Journal of Economics 119(1):249–75.

Best Answer

A nice feature of difference-in-differences (DiD) is that you don't need panel data for it. Given that the treatment happens at some level of aggregation (in your case, cities), you only need to sample random individuals from the cities before and after the treatment. This allows you to estimate $$ y_{ist} = A_g + B_t + \beta D_{st} + c X_{ist} + \epsilon_{ist} $$ where $A_g$ is the group (here, city) fixed effect, and to obtain the causal effect of the treatment as the expected post-pre outcome difference for the treated minus the expected post-pre outcome difference for the control.

There is a case in which people use individual fixed effects instead of a treatment-group indicator: when there is no well-defined level of aggregation at which the treatment occurs. In that case you would estimate $$ y_{it} = \alpha_i + B_t + \beta D_{it} + cX_{it}+\epsilon_{it} $$ where $D_{it}$ is an indicator for the post-treatment period for individuals who received the treatment (for example, a job market program that is not tied to particular locations). For more information on this, see the lecture notes by Steve Pischke.

In your setting, adding individual fixed effects should not change the point estimates: the treatment-group effect $A_g$ is simply absorbed by the individual fixed effects. However, these fixed effects might soak up some of the residual variance and therefore potentially reduce the standard error of your DiD coefficient.
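
One way to see the absorption, as a sketch under the assumption that nobody changes group, so that each individual $i$ belongs to a single group $g(i)$ (this notation is introduced here and is not in the original): the group effect is then constant within each individual and can be written as a linear combination of the individual dummies, $$ A_{g(i)} = \sum_{j} A_{g(j)} \, 1[j = i], $$ so an individual effect $\alpha_i = A_{g(i)} + \mu_i$ already contains it, and a separate group term is collinear with the individual dummies and gets dropped.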

Here is a code example which shows that this is the case. I use Stata but you can replicate this in the statistical package of your choice. The "individuals" here are actually countries but they are still grouped according to some treatment indicator.

* load the data set (requires an internet connection)
use "http://dss.princeton.edu/training/Panel101.dta"

* generate the time and treatment group indicators and their interaction
gen time = (year>=1994) & !missing(year)
gen treated = (country>4) & !missing(country)
gen did = time*treated

* do the standard DiD regression
reg y_bin time treated did

------------------------------------------------------------------------------
       y_bin |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        time |       .375   .1212795     3.09   0.003     .1328576    .6171424
     treated |   .4166667   .1434998     2.90   0.005       .13016    .7031734
         did |  -.4027778   .1852575    -2.17   0.033    -.7726563   -.0328992
       _cons |         .5   .0939427     5.32   0.000     .3124373    .6875627
------------------------------------------------------------------------------

* now repeat the same regression but also including country fixed effects
areg y_bin time treated did, a(country)

------------------------------------------------------------------------------
       y_bin |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        time |       .375    .120084     3.12   0.003     .1348773    .6151227
     treated |          0  (omitted)
         did |  -.4027778   .1834313    -2.20   0.032    -.7695713   -.0359843
       _cons |   .6785714    .070314     9.65   0.000       .53797    .8191729
-------------+----------------------------------------------------------------

So you see that the DiD coefficient remains the same when the individual fixed effects are included (areg is one of the available fixed effects estimation commands in Stata). The standard errors are slightly tighter and our original treatment indicator was absorbed by the individual fixed effects and therefore dropped in the regression.
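
The same equivalence can be checked by hand in the simplest case. The following is a minimal sketch in Python rather than Stata, with made-up numbers, and it assumes a balanced two-period panel with no movers and no covariates. With two periods, the individual fixed effects estimator is numerically the same as first-differencing each person, so the comparison reduces to a few averages. All names in the block are hypothetical.

```python
from statistics import mean

# Hypothetical toy panel: (person, treated_group, period, outcome).
# Every person is observed once before (0) and once after (1); nobody moves.
panel = [
    ("a", 1, 0, 2.0), ("a", 1, 1, 5.0),
    ("b", 1, 0, 3.0), ("b", 1, 1, 5.5),
    ("c", 0, 0, 1.0), ("c", 0, 1, 2.0),
    ("d", 0, 0, 4.0), ("d", 0, 1, 4.5),
]

def cell_mean(group, period):
    """Average outcome in one group-by-period cell."""
    return mean(y for _, g, t, y in panel if g == group and t == period)

# Pooled DiD: difference of the post-pre changes in the four cell means.
did_pooled = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# Individual fixed effects with two periods: first-difference each person,
# which removes anything time-invariant about them, then compare the groups.
pre = {p: y for p, _, t, y in panel if t == 0}
post = {p: y for p, _, t, y in panel if t == 1}
group_of = {p: g for p, g, _, _ in panel}
change = {p: post[p] - pre[p] for p in pre}
did_fe = (mean(d for p, d in change.items() if group_of[p] == 1)
          - mean(d for p, d in change.items() if group_of[p] == 0))

print(did_pooled, did_fe)
```

Both numbers coincide in this toy panel. With an unbalanced panel, movers, or covariates, the two estimates need not be identical.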

In response to the comment
I mentioned the Pischke example to show when people use individual fixed effects rather than a treatment group indicator. Your setting has a well-defined group structure, so the way you have written your model is perfectly fine. Standard errors should be clustered at the city level, i.e. the level of aggregation at which the treatment occurs (I haven't done this in the example code, but in DiD settings the standard errors need to be corrected, as demonstrated by the Bertrand et al. paper).
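
As a sketch of what clustering at the city level does, in standard notation that is not used above (stack the regressors of all observations in city $s$ into $X_s$ and the corresponding OLS residuals into $\hat\epsilon_s$, and let $\hat\theta$ collect all coefficients), the usual cluster-robust variance estimator, up to a finite-sample correction, is $$ \widehat{V}(\hat\theta) = (X'X)^{-1} \left( \sum_{s} X_s' \hat\epsilon_s \hat\epsilon_s' X_s \right) (X'X)^{-1}. $$ It allows the errors to be arbitrarily correlated within a city, across individuals and across years, while treating different cities as independent. In Stata this is typically requested with the vce(cluster varname) option of the estimation command.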

Regarding the movers, they don't have much of a role to play here. The treatment indicator $D_{st}$ is equal to 1 for people who live in a treated city $s$ in the post-treatment period $t$. To compute the DiD coefficient, we actually just need to compute four conditional expectations, namely $$ \beta = \left[ E(y_{ist}|s=1,t=1) - E(y_{ist}|s=1,t=0)\right] - \left[ E(y_{ist}|s=0,t=1) - E(y_{ist}|s=0,t=0)\right] $$ where, with a slight abuse of notation, $s=1$ denotes treated cities and $t=1$ the post-treatment period.

So if you have 4 post-treatment periods for an individual who lives in a treated city for the first two, and then moves to a control city for the remaining two periods, the first two of those observations will be used in the computation of $E(y_{ist}|s=1,t=1)$ and the last two in $E(y_{ist}|s=0,t=1)$. To make it clear why identification comes from the group differences over time and not from the movers you can visualize this with a simple graph. Suppose the change in the outcome is truly only because of the treatment and that it has a contemporaneous effect. If we have an individual who lives in a treated city after the treatment starts but then moves to a control city, their outcome should go back to what it was before they were treated. This is shown in the stylized graph below.

You might still want to think about movers for other reasons though. For instance, the treatment might have a lasting effect, i.e. it might still affect the outcome even after the individual has moved.
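
The four conditional expectations and the way a mover's observations are split across them can be sketched numerically. This is a minimal Python illustration with made-up data, assuming a purely contemporaneous treatment effect of 2.0 and a common time trend of 0.5; the person labels and values are hypothetical.

```python
from statistics import mean

# Hypothetical observations: (person, city_treated, post_period, outcome).
# Person "m" lives in a treated city for two post periods, then moves to a
# control city for two more. Assumed truth: common trend 0.5, effect 2.0.
obs = [
    ("a", 1, 0, 1.0), ("a", 1, 1, 3.5),
    ("b", 0, 0, 1.0), ("b", 0, 1, 1.5),
    ("m", 1, 0, 1.0),
    ("m", 1, 1, 3.5), ("m", 1, 1, 3.5),   # still in the treated city
    ("m", 0, 1, 1.5), ("m", 0, 1, 1.5),   # after moving to a control city
]

# Sort every observation into its (city_treated, post_period) cell.
cells = {}
for _, s, t, y in obs:
    cells.setdefault((s, t), []).append(y)

# Sample analogues of the four conditional expectations.
e = {cell: mean(values) for cell, values in cells.items()}

did = (e[(1, 1)] - e[(1, 0)]) - (e[(0, 1)] - e[(0, 0)])
print(did)
```

The mover contributes two observations to the treated post-treatment cell and two to the control post-treatment cell, and the estimate still recovers the assumed effect because the effect is contemporaneous. If the effect persisted after the move, the mover's later observations would raise the control post-treatment mean.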

Related Solutions

Solved – group fixed-effects, not individual-fixed effects using plm in R

I have worked on similar projects and am confronting one right now. The way that we handle this is to put in a fixed effect for each village and then to cluster the standard errors by village. This is not a perfect solution, but is fairly standard practice.

The plm package in R, the xtreg ..., fe command in Stata, and the traditional fixed effects (within) estimator are all designed to follow individuals. I believe one name for the method you want is a hierarchical linear model.

The simplest implementation in R would be something like

myLM <- lm(y ~ x + v + v.t*t, data = df)

where y is the outcome of interest, x is some set of controls, v is a factor variable for the villages, v.t is a binary (factor) variable indicating whether a village was treated, and t is an indicator for pre-post treatment.

For standard inference, it is typical and recommended to produce clustered standard errors, using either the multiwayvcov package or the clusterSEs package.

Another method for inference, and the preferred method in Bertrand, Duflo & Mullainathan (2004), is to perform a placebo test, where you vary "treatment" across all villages, form an empirical CDF of the resulting placebo effects, and see where the effect of treatment for the truly treated village sits in that distribution. Note that this is roughly the same method recommended for inference with the synthetic controls of Abadie, Diamond, and Hainmueller, and it has ties back to Fisher's 1935 text.
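
The placebo idea can be sketched in a few lines. The following is a minimal Python illustration with made-up village-level numbers, not the procedure of any particular package; it assumes the data have already been collapsed to one post-minus-pre change per village and that a single village was truly treated. All names are hypothetical.

```python
from itertools import combinations
from statistics import mean

# Hypothetical village-level changes in the outcome (post mean minus pre mean).
change = {"v1": 2.9, "v2": 0.4, "v3": 0.6, "v4": 0.5, "v5": 0.3, "v6": 0.7}
treated = {"v1"}   # assumed: a single truly treated village

def did(treated_set):
    """Mean change in the 'treated' villages minus mean change in the rest."""
    t = [c for v, c in change.items() if v in treated_set]
    u = [c for v, c in change.items() if v not in treated_set]
    return mean(t) - mean(u)

actual = did(treated)

# Reassign "treatment" to every possible set of villages of the same size.
placebos = [did(set(combo)) for combo in combinations(change, len(treated))]

# Share of assignments whose effect is at least as large in absolute value.
# The true assignment is one of them, so the share is never zero.
p_value = sum(abs(p) >= abs(actual) - 1e-12 for p in placebos) / len(placebos)
print(actual, p_value)
```

With only six villages the smallest attainable share is 1/6, which shows how the number of groups limits what such a test can detect.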

Solved – Fixed effect model with household level and state level data

This is a fixed effects model. You should probably cluster your standard errors at the state level. I think it is reasonable to assume the unemployment rate is exogenous. Roughly speaking, any single state resident cannot significantly influence the unemployment rate, while the unemployment rate can have a significant influence on any single resident's behavior. Education, however, could be endogenous, since both BMI and education could be linked to an unobserved motivation factor.

If education is endogenous, then unless $\hat \beta_{edu}$ and $\hat \beta_{ur}$ are completely uncorrelated, $\hat \beta_{ur}$ will be a biased estimate of the causal effect. From here you could either:

1. Find a REALLY good reason for why education is exogenous (I don't know if this is possible).
2. Include other covariates to control for unobserved confounders: male/female indicators, mother's education, father's education, income, etc.
3. Find a good instrument for education. Though it's outdated, Angrist and Krueger (1991) use season of birth to instrument education. Labor economists have both criticized and revised this instrument, but it's a start.
4. Construct some sort of structural equation, such as a simultaneous system, to account for the endogeneity of both BMI and education.

Overall, unless you are trying to publish something, I would just go with (2) from above.
