Solved – Why use control variables in difference-in-differences?

causality, multiple regression, regression

I have a question on the difference-in-differences approach with the following standard equation:
$$
y= a + b_1\text{treat}+ b_2\text{post} + b_3\text{treat}\cdot\text{post} + u
$$
where treat is a dummy variable for the treated group and post is a dummy for the post-treatment period.

Now, my question is simple: Why do most papers still use additional control variables? I thought that if the parallel trends assumption holds, then we should not have to worry about additional controls. I can only think of two possible reasons for using control variables:

  1. without them, trends would not be parallel
  2. because the DiD specification attributes any difference in trends between the treatment and control groups at the time of treatment to the intervention (i.e., to the interaction term treat·post); if we don't control for other variables, the coefficient on the interaction may be over- or understated

Could anyone shed some light on this issue? Do my reasons 1) and 2) make sense at all? I don't fully understand the use of control variables in DiD.

Best Answer

without them [i.e., additional variables], trends would not be parallel

Yes, that's right. There may be unit-specific trends that you're not accounting for unless you add time-varying variables to the model.

Even if the parallel trends assumption is satisfied without additional variables, adding additional variables can increase the precision of your estimates, just as in other regressions. I think that this is part of what Michael Chernick has in mind.
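The precision point can be illustrated with a minimal simulation (a hypothetical sketch, not from the question): a covariate x that strongly affects the outcome but is independent of treatment status does not bias the DiD estimate, yet including it shrinks the standard error on the interaction term.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Hypothetical data: treat and post are random dummies; x is a covariate
# that is independent of treatment but explains much of the outcome noise.
treat = rng.integers(0, 2, n)
post = rng.integers(0, 2, n)
x = rng.normal(size=n)
# True treatment effect (coefficient on treat*post) is 1.0.
y = 0.5 + 0.3 * treat + 0.2 * post + 1.0 * treat * post + 2.0 * x \
    + rng.normal(size=n)

def ols(X, y):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    se = np.sqrt(np.diag(sigma2 * np.linalg.inv(X.T @ X)))
    return beta, se

ones = np.ones(n)
X_base = np.column_stack([ones, treat, post, treat * post])       # no control
X_ctrl = np.column_stack([ones, treat, post, treat * post, x])    # with x

b0, se0 = ols(X_base, y)
b1, se1 = ols(X_ctrl, y)
print(f"interaction without x: {b0[3]:.2f} (SE {se0[3]:.3f})")
print(f"interaction with    x: {b1[3]:.2f} (SE {se1[3]:.3f})")
```

Both specifications recover the true effect on average, but the regression that includes x has a much smaller residual variance and hence a tighter standard error on the interaction coefficient.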

Mostly Harmless Econometrics has a nice discussion that may be helpful. See especially pages 236-37.
