Confused about results from placebo diff-in-diff

Tags: causality, difference-in-difference, regression, stata

I construct the following simple placebo sample:
[Image: the placebo sample]

I then construct a dummy for whether the year is after the placebo treatment year, 2012. I interact this dummy with the treatment dummy to construct the diff-in-diff variable, did.

Since the treatment and control groups have perfectly parallel trends across all periods, the regression of dependent_var on did should produce a coefficient of 0. Yet in Stata, the command "reg dependent_var did" gives me a coefficient of 2 with a p-value of 0.05. Results remain significant even with robust standard errors and year fixed effects.

What is going on? Am I interpreting the diff-in-diff coefficient incorrectly?

Best Answer

I will tender an answer since I have a better understanding of your problem and I am limited in my response in the comments.

Just to be clear, it is important you have the correct difference-in-differences (DD) setup before conducting your placebo test. I assume you want to estimate the following model

$$ y_{it} = \alpha + \gamma T_{i} + \lambda Post_{t} + \delta(T_{i} \times Post_{t}) + \epsilon_{it}, $$

where you have repeated observations of cross-sectional unit $i$ across $t$ years and $\alpha$ is the intercept (Stata includes one by default). Note that $i$ could represent individuals, households, counties, states, et cetera. The variable $T_{i}$ is your treatment dummy, which sorts the units into two distinct groups: one treatment group and one control group. The $Post_{t}$ dummy indicates the years after treatment in both groups. The coefficient on the interaction of these two dummies, $\delta$, is the DD estimate.
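To make the interpretation of $\delta$ concrete, here is the standard identity for this specification, written out as a sketch. It assumes the regression includes an intercept and both main effects, so that the four group-by-period means are fitted exactly:

$$ \delta = \big( E[y_{it} \mid T_{i}=1, Post_{t}=1] - E[y_{it} \mid T_{i}=1, Post_{t}=0] \big) - \big( E[y_{it} \mid T_{i}=0, Post_{t}=1] - E[y_{it} \mid T_{i}=0, Post_{t}=0] \big). $$

The first bracket is the before/after change in the treatment group, the second is the same change in the control group. Under parallel trends and no treatment, the two changes are equal and $\delta = 0$.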

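To go back to my earlier comments for a moment: at the very least, the model requires these three variables to obtain the DD estimate. You cannot forgo the two main effects. In other words, you cannot include only the interaction $D_{it} = T_{i} \times Post_{t}$ without the corresponding main effects for group and time. If you do, as in "reg dependent_var did", the coefficient simply compares the mean outcome of treated post-period observations with the mean of all other observations. It therefore absorbs any level difference between the groups and any common time trend, and it can be far from zero even when the trends are perfectly parallel.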
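The role of the main effects can be checked numerically. The sketch below is in Python rather than Stata so that it runs with the standard library alone. The data are hypothetical (the original sample is not reproduced here): two groups with a level gap of 5, a common trend of +1 per year, and a post dummy that assumes the placebo period starts in 2012. The `ols` helper is a small illustrative solver, not a library routine.

```python
def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(X, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        A[c] = [v / A[c][c] for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

# Hypothetical sample: perfectly parallel trends, no treatment effect at all.
rows = []
for treat in (0, 1):
    for year in (2011, 2012, 2013, 2014):
        post = int(year >= 2012)                    # assumed placebo period
        y_val = 10 + 5 * treat + 1.0 * (year - 2011)
        rows.append((treat, post, treat * post, y_val))

y = [r[3] for r in rows]

# Interaction only (plus intercept): mirrors regressing the outcome on did alone.
b_did_only = ols([[1, r[2]] for r in rows], y)[1]

# Full DD model: intercept, group effect, period effect, interaction.
b_full = ols([[1, r[0], r[1], r[2]] for r in rows], y)[3]

print("did only:", round(b_did_only, 6))
print("full DD :", abs(round(b_full, 6)))
```

With these made-up numbers the interaction-only coefficient is 4.8, while the coefficient from the full model is zero up to floating-point error. The 4.8 is just the mean of the treated post-period observations (17) minus the mean of everything else (12.2).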

> I then construct a dummy for whether the year is after the placebo treatment year, 2012. I interact this dummy with the treatment dummy to construct the diff-in-diff variable, did.

This is correct, and it is one way of conducting a placebo test: you shift the treatment date to a period when no treatment occurred. You should not detect a difference in trends in years when the policy/treatment/exposure is absent.

Let's talk briefly about one possible setup. Assume your placebo treatment year is 2012. In your case, you want to interact your treatment dummy with separate year indicators. Deconstructing $Post_{t}$ into separate dummies for each year (excluding one year to avoid collinearity) would result in the following

$$ y_{it} = \alpha + \gamma T_{i} + \sum_{k=2012}^{2014} \lambda_{k} \mathbf{I}_{t = k} + \sum_{k=2012}^{2014} \delta_{k} (T_{i} \times \mathbf{I}_{t = k}) + \epsilon_{it}. $$

This is a fancy way of saying: create a dummy variable for each year, include those year dummies in the model, and interact each one with the treatment variable. Each $\delta_{k}$ is akin to a separate DD estimate for year $k$, measured relative to the omitted year. I assume 2012 is one of the years preceding treatment exposure. You could also test for a difference in trend in 2011. Just remember which year you are leaving out!
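As a rough illustration of the year-by-year interactions, the Python sketch below uses the fact that a model with a group dummy, year dummies, and all group-by-year interactions is saturated, so each interaction coefficient equals the treated-minus-control gap in that year minus the gap in the omitted year. Everything about the data is an assumption made for the example: four hypothetical units, 2011 as the omitted year, and a real effect of +3 that starts in 2014.

```python
from collections import defaultdict
from statistics import mean

REF_YEAR = 2011           # omitted (reference) year; an assumption
TRUE_EFFECT_YEAR = 2014   # hypothetical year in which treatment really starts
YEARS = range(2011, 2015)

# Hypothetical panel: (baseline level, treated flag) for each unit.
units = [(10.0, 0), (12.0, 0), (15.0, 1), (17.0, 1)]

# Collect outcomes by (group, year): common trend of +1 per year,
# plus a +3 effect for treated units from TRUE_EFFECT_YEAR onward.
cells = defaultdict(list)
for level, treat in units:
    for year in YEARS:
        effect = 3.0 if (treat and year >= TRUE_EFFECT_YEAR) else 0.0
        cells[(treat, year)].append(level + 1.0 * (year - 2011) + effect)

# Treated-minus-control gap in mean outcomes, year by year.
gap = {year: mean(cells[(1, year)]) - mean(cells[(0, year)]) for year in YEARS}

# Year-specific DD estimates relative to the reference year.
dd_by_year = {year: gap[year] - gap[REF_YEAR] for year in gap if year != REF_YEAR}

print(dd_by_year)  # {2012: 0.0, 2013: 0.0, 2014: 3.0}
```

The placebo years 2012 and 2013 show no difference, and the estimate becomes non-zero only in the year the hypothetical treatment begins. The `gap` values are also the numbers you would plot to show the group trends.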

> Since the treatment and control groups have perfectly parallel trends across all periods, the regression of dependent_var on did should produce a coefficient of 0.

In this case, you are estimating a unique effect for 2012, which should be indistinguishable from zero.

> By the way, can I conclude that the parallel trends assumption (approximately) holds if the DD coefficient is large but insignificant?

The foregoing question was reproduced from the comments. The quick answer is yes, with a caveat: an insignificant placebo coefficient is consistent with parallel trends but does not prove them, and a large estimate with a wide confidence interval may simply reflect low power. The common trend assumption is often taken for granted, but in your case you are subjecting it to an explicit test. You do not want to capture significant, non-zero effects before the treatment begins. I would also plot the evolution of the group trends over time. The treatment and control groups should move visibly in parallel before the treatment begins. I wouldn't just do this test and move on. Show the trends too!

In your example, you are working with only four years' worth of data, so pretreatment years are scarce. Three or more pretreatment years are often preferred. To conclude, finding no difference in trend in a pretreatment year is one way to support the claim that your estimate isolates the treatment effect: effects should manifest only around the treatment/exposure period.