Solved – Test for parallel trends in difference-in-difference models with staggered treatment

Tags: difference-in-differences, stata

Following a comment from a previous thread (below), I would appreciate advice on how to test for parallel trends in Stata for a DiD model with multiple groups and staggered treatment (i.e., a policy reform). Almost all units become treated eventually.
Difference in Difference method: how to test for assumption of common trend between treatment and control group?

The original DiD model command is as follows:

* two-way fixed effects DiD: unit fixed effects, year dummies, SEs clustered by unit
xtreg outcome i.policy i.year, fe vce(cluster id)

A very useful discussion of this is found in the links below; however, I could not implement it in Stata.
http://econ.lse.ac.uk/staff/spischke/ec533/did.pdf
http://econ.lse.ac.uk/staff/spischke/ec524/evaluation3.pdf

Thank you in advance.


@ThomasBilach. Thanks a lot for sharing this post. I am still confused about which variables to interact. In your post, T(ij) are interactions of the treatment indicator and time dummies. Two questions, please:
• Are the values of T(ij) the same for every country (the variable "treatment" in my data), or does T(ij) switch on at the date the policy was implemented (the variable "policy" in my data)?
• By time dummies, did Andy mean the standardized time variable (the variable "time" in my data) or the date dummy (the variable "date" in my data)?
[Screenshot of example panel data showing the variables treatment, policy, time, and date]
Note that almost all countries are eventually treated.

Best Answer

In the post you referenced, the purpose of standardizing the time dimension is to produce what is often called a coefficient plot. In staggered adoption designs, researchers often center time on the period treatment commences. I only recommend standardizing the time dimension in settings where all units eventually become exposed to a treatment; in other words, every unit in your sample sits some number of periods away from period 0. Most software packages will help you 'dummy out' the individual periods, which represent the leads and lags of treatment.

What confuses most people is how to standardize the time dimension when a subset of units never experiences the treatment. You indicated in your post that almost all units eventually become treated, which implies you do have a viable control group. If so, then the control group does not get assigned any relative values; its lead and lag dummies should consistently equal 0. Due to the staggered implementation of the policy over time, a control unit isn't relative to any particular moment in time. I should also note that a standard interaction term isn't going to instantiate the individual leads and lags for you; I recommend creating them manually.
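To make this concrete, here is a minimal Stata sketch of the standardization step. The variable names id, year, and adopt_year are assumptions about your data: adopt_year holds the first year the policy is in force for a treated country and is missing for countries that are never treated.

* standardize the time dimension: years relative to policy adoption
* (adopt_year is an assumed variable: first policy year, missing if never treated)
xtset id year
gen rel_year = year - adopt_year if !missing(adopt_year)
* never-treated countries keep a missing rel_year; the lead/lag dummies built
* from it further below will simply equal 0 for them in every period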

Suppose you wish to estimate the following:

$$ y_{it} = \mu_{i} + \lambda_{t} + \delta\text{Policy}_{it} + \epsilon_{it}, $$

where $\mu_{i}$ and $\lambda_{t}$ represent fixed effects for countries and years, respectively. The treatment dummy, $\text{Policy}_{it}$, should only 'turn on' for treated countries and only during their post-treatment years, 0 otherwise (see a previous post where I described the coding of the treatment dummy in greater detail). Suppose the post-period for one particular treated country runs from the year 2015 onward. In this setting, the dummy switches from 0 to 1 in 2015 for that jurisdiction and stays at 1 for all subsequent years $t$ until the end of your panel (or until the treatment is withdrawn). This model assumes the treatment's effects are immediate and permanent; in other words, it doesn't assess the dynamics of exposure. Again, $\text{Policy}_{it}$ is still your interaction term. It equals unity for any country-year combination where your policy is in effect, 0 otherwise. For countries never treated, it should equal 0 for all time periods in which they are under observation.
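As a sketch of that coding, and again assuming an adopt_year variable (first policy year, missing for never-treated countries), the snippet below rebuilds the policy dummy and estimates the static model from your question:

* treatment dummy: 1 for treated countries from the adoption year onward, 0 otherwise
* (the !missing() condition keeps never-treated countries at 0 in every period)
gen byte policy = (year >= adopt_year) & !missing(adopt_year)

* static two-way fixed effects model: immediate, permanent effect
xtreg outcome i.policy i.year, fe vce(cluster id)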

Now suppose you want to assess a time dependency in $y$'s response to the shock. In the classical difference-in-differences case, where all units experience the shock at the same time, this is very easy: you interact an indicator for treated countries with post-period time dummies common to treated and untreated countries, and software does most of the heavy lifting for you. In your setting, however, the exposure of interest starts (and possibly ends) at different times in different countries, and nearly all countries eventually undergo treatment. A variable delineating the "post" period is not useful in staggered adoption designs, in part because there is no single period separating pre- and post-treatment. I recommend you instantiate the policy variable(s) manually. A properly coded policy dummy equals unity for all groups and time periods subject to the policy, 0 otherwise. This is your interaction term, just defined a different way.

Instead of one discrete policy dummy, you can instantiate a series of pre- and post-exposure policy dummies (i.e., $D_{it}$'s). Here is one example involving one lead and two lags of the main policy dummy:

$$ y_{it} = \mu_{i} + \lambda_{t} + \delta_{+1} D_{i,t+1} + \delta D_{it} + \delta_{-1} D_{i,t-1} + \delta_{-2} D_{i,t-2} + \epsilon_{it}, $$

where $D_{it}$ is the immediate effect of the exposure for all countries undergoing treatment. To be clear, the immediate or instantaneous effect is equal to 1 for a treated country in the initial adoption year (i.e., the year of change). In your fake example, the "year of change" (or should I say "day of change," to maintain consistency with your example) is January 4th for the first country and January 3rd for the second country. The control countries, on the other hand, remain consistently equal to 0 in all periods. Your "lags" investigate how effects evolve after the initial adoption year (e.g., $D_{i,t-1}$, $D_{i,t-2}$, $D_{i,t-3}$, etc.). Oftentimes population-level interventions aren't perceived immediately after policy adoption; it may take some time before the full effect is realized. If you include policy dummies for all periods before and after the initial year of change, then you're mapping out the full dynamic response of your outcome to the policy change. Note how each $D_{it}$ is an interaction term, just defined in a different way. In words, each policy dummy is the product of a treatment indicator and the corresponding pre-/post-exposure year dummy. Again, the interaction is implicit in the coding of each policy variable, so no explicit multiplicative term is required in the model formula. The constituent elements of the interaction term aren't required either, as the relevant information is already captured by the fixed effects.
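Here is a minimal Stata sketch of that manual coding, building on the assumed rel_year variable from the earlier snippet (one lead, the adoption-year dummy, and two lags):

* lead and lag policy dummies; a missing rel_year (never-treated country)
* evaluates to 0 in each comparison, so the control group stays at 0 throughout
gen byte D_lead1 = (rel_year == -1)   // year before adoption
gen byte D_0     = (rel_year == 0)    // adoption year: immediate effect
gen byte D_lag1  = (rel_year == 1)    // one year after adoption
gen byte D_lag2  = (rel_year == 2)    // two years after adoption

* event-study regression with country and year fixed effects
xtreg outcome D_lead1 D_0 D_lag1 D_lag2 i.year, fe vce(cluster id)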

Stevenson and Wolfers (2006) employ a similar specification, reporting all coefficients from the initial adoption year onward in tabular form (see Table 1, p. 277). Later, they report event study estimates, plotting coefficient estimates for all periods before and after the law change (see Figure 1, p. 280). Each plot comes from a separate regression for a different age group, but the structure is the same. Reproducing their figure below, they plot each estimate of the policy dummy for every year relative to the law (policy) change.

[Stevenson and Wolfers (2006), Figure 1: coefficient estimates plotted by year relative to the law change]

I can't offer further guidance regarding how many policy variables to include beyond the immediate effect. For explication purposes, I only included one policy lead (i.e., $D_{i,t+1}$), which is equal to unity if a country is ever treated and is in the year before treatment adoption. If the parallel trends assumption is plausible, you should expect your estimate of $\delta_{+1}$ to be close to zero and statistically insignificant. It is common in papers to see plots of the coefficients on each policy variable. Fully saturating your model is not necessary, but it is often used to exploit the timing of the intervention. See the top answer here for a popular use case.
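To turn this into an explicit pre-trend check, here is a hedged sketch: after fitting the event-study regression above, test whether the lead coefficient(s) are zero and plot the estimates. The plot relies on the user-written coefplot package (install it with ssc install coefplot); all variable names follow the earlier sketches.

* parallel pre-trends check: the lead(s) should be indistinguishable from zero
test D_lead1                       // list several leads here for a joint test

* plot the lead/lag estimates, in the spirit of Stevenson and Wolfers (2006)
* (requires the user-written coefplot package: ssc install coefplot)
coefplot, keep(D_*) vertical yline(0) ///
    xtitle("Years relative to policy adoption") ytitle("Estimated effect")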
