Time-Series – How to Account for Future Treatment History in TSCS Data While Calculating ATE/ATC/ATT

causality, difference-in-difference, matching, panel data, time series

I refer to the seminal paper "Matching Methods for Causal Inference with Time-Series Cross-Sectional Data" by Kosuke Imai, In Song Kim, and Erik Wang.

Context

In the paper, the authors propose a two-layer adjustment to the typical DiD estimator to reduce bias and balance covariates: (1) treatment history matching and (2) refinement methods such as propensity score weighting or matching, or Mahalanobis distance matching. The crazy thing about this methodology is that, even in a TSCS dataset where treatment is staggered and can be reversed, one can still estimate the causal quantity of interest (ATT, ATC, ATE) not only contemporaneously but also at time $T+F$, where $F$ is the number of time periods after the baseline period $T$.

The authors have since released an implementation of this method in the R package PanelMatch.

Question

My question for those who are familiar with the paper is this:

Say, for example, I am trying to understand how a sales rebate affects the quantity of a product bought by a customer. In this case, $D=1$ when a sales rebate is given for the month, and $D=0$ otherwise. I am interested in both the contemporaneous effect and the effects in subsequent months.

Given that treatment is allowed to be reversed, and that the causal quantity of interest can be calculated $F$ periods ahead (where $F=0$ is the contemporaneous effect), how do the authors account for variation in 'future' treatment status among both matched control and treated units when calculating a quantity of interest? Two examples of such variation are below:

Control unit (i.e. $D=0$ when $t=T$) switches to $D=1$ at $T+f$, where $f<F$? Wouldn't this affect the construction of the counterfactual 'control'?
If treated units have different treatment status after $t=T$, where one treated unit may have the following sequence for $t=[T,T+3]$ of (1, 1, 0, 0) vs another which may have (1, 0, 0, 0)? (i.e., $F = 3$) wouldn't it affect the calculation of the average ATT across treated population?

The authors allude to this on page 10 of the paper in the following quote, but appear to chalk it up to an exception as far as I can tell:

When researchers are interested in a non-contemporaneous treatment effect (i.e. $F>0$), the ATT defined in Equation (8) does not specify the future treatment sequence. As a result, the matched control units may include those units who receive the treatment after time $t$ but before the outcome is measured at time $t+F$. Similarly, some treated units may return to the control conditions between time $t$ and time $t+F$.

In addition, I am also keen to understand that if there is a solution to this, whether this may have already been addressed in PanelMatch. This will be incredibly helpful to know.

Thank you all!

Best Answer

Control unit (i.e. $D=0$ when $t=T$) switches to $D=1$ at $T+f$, where $f<F$? Wouldn't this affect the construction of the counterfactual 'control'?

The PanelMatch() function does not account for all possible future treatment sequences when refining the matched sets. In other words, its default behavior allows for future policy switches. The authors address this concern in their paper: the matched control units may include units that receive the treatment after time $t$ (i.e., a policy switch), and some of the treated units may return to the control condition between time $t$ and time $t + F$. This is inevitable in settings where we want to estimate long-term causal effects.

But say you want to assess the causal effects of a rebate program that was administered continuously over time. In that case, just be careful about your choice of $F$. Start by "eyeballing" how the treatment variable is distributed across units and over time; this is where the DisplayTreatment() function is your best friend. If you're interested in, say, the effect three months after the start of the rebate program, then most, if not all, of the treated units should remain in the treatment condition for at least 3 periods. Note that as we extend the future sequence, it becomes quite possible that some, or most, of the units no longer participate in the program. In essence, be careful when assessing long-term causal effects. The authors specifically caution users that a large value of $F$ may make the interpretation of causal effects a bit murky, especially if many units change their treatment status during the $F$ lead time periods.
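As a rough complement to the visual check, the persistence of treatment can also be tabulated directly. The sketch below is plain Python, not a PanelMatch call; the function name `onset_persistence_share`, the customer IDs, and the data are all hypothetical. It assumes one list of 0/1 rebate indicators per customer, ordered by month, and reports the share of treatment onsets (a switch from 0 to 1) that stay treated through each candidate lead $F$.

```python
from typing import Dict, List


def onset_persistence_share(panel: Dict[str, List[int]], leads: int) -> float:
    """Share of treatment onsets (0 -> 1 switches) that remain treated
    in every period from the onset through `leads` periods later.
    Onsets too close to the end of the panel to observe the full
    lead window are skipped."""
    persistent = 0
    total = 0
    for d in panel.values():
        for t in range(1, len(d) - leads):
            if d[t - 1] == 0 and d[t] == 1:  # treatment switches on at t
                total += 1
                if all(x == 1 for x in d[t:t + leads + 1]):
                    persistent += 1
    return persistent / total if total else float("nan")


# Hypothetical monthly rebate indicators (made-up data for illustration).
panel = {
    "cust_a": [0, 1, 1, 1, 1, 0],
    "cust_b": [0, 1, 0, 0, 0, 0],
    "cust_c": [0, 0, 1, 1, 1, 1],
}

for F in range(4):
    print(F, round(onset_persistence_share(panel, F), 3))
```

If the share drops sharply at some lead, that is a hint that an ATT estimated at that lead mixes units with quite different treatment paths.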

If treated units have different treatment status after t=T, where one treated unit may have the following sequence for $t=[T,T+3]$ of (1, 1, 0, 0) versus another which may have (1, 0, 0, 0)? (i.e., $F = 3$) wouldn't it affect the calculation of the average ATT across treated population?

In short, it matters with respect to inference.

Say you specify $F = 3$. In the example provided, both units have left the treatment by the second lead period ($T+2$): one reverses at $T+1$ and the other at $T+2$. Again, you need to be intentional with the lead = ... parameter. The values for $F$ should make sense given the treatment variation you're observing in the real world. Estimating short-term ATTs seems more appropriate given those treatment sequences.
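To make the two sequences from the question concrete, here is a small plain-Python tabulation (the helper name `share_treated_by_lead` is hypothetical) of the fraction of treated units still under treatment at each lead $f = 0, \dots, 3$:

```python
# The two treated sequences from the question, for t = T, T+1, T+2, T+3.
sequences = [(1, 1, 0, 0), (1, 0, 0, 0)]


def share_treated_by_lead(seqs):
    """For each lead f, the fraction of units with D = 1 at time T + f."""
    n_leads = len(seqs[0])
    return [sum(s[f] for s in seqs) / len(seqs) for f in range(n_leads)]


shares = share_treated_by_lead(sequences)
print(shares)  # [1.0, 0.5, 0.0, 0.0]
```

From $T+2$ onward no unit is treated, so an estimate at $F = 2$ or $F = 3$ describes the lingering effect of a rebate that has already ended rather than the effect of an ongoing one.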

I am also keen to understand that if there is a solution to this, whether this may have already been addressed in PanelMatch.

If you're concerned with the "switchers" and how they may affect inference, then proceed as the authors recommend: estimate the ATT of a stable policy change, where the counterfactual scenario is that a treated unit does not receive the treatment before the outcome is measured. The authors expound upon this alternative framework in their appendix. By employing this approach, you're saying it is not permissible for a treatment to reverse in the specified lead window. The treatment should be in place for at least $F$ time periods after the policy change. The specific answer you're looking for may be found in Appendix A on page 1:

We first constrain the matched set for each treated observation (i, t) such that the matched control units do not receive the treatment at least after time $t + F$.

But I actually prefer the explanation provided in the PanelMatch documentation. When you force the treatment not to reverse in the user-defined future sequence (e.g., forbid.treatment.reversal = TRUE),

"...only matched sets for treated units where treatment is applied continuously in the lead window are included in the results."
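The logic of a stable policy change can be sketched in a few lines. This is an illustration in plain Python under my own assumptions, not PanelMatch's internal implementation: the function `stable_policy_sets`, the unit names, and the data are hypothetical, and matched sets are assumed to be given as a mapping from a treated observation (unit, t) to its matched control units. A matched set is kept only if the treated unit stays treated from $t$ through $t+F$, and within it only controls that stay untreated over the same window are retained.

```python
def stable_policy_sets(panel, matched_sets, leads):
    """Filter matched sets to a 'stable policy change' comparison.

    panel:        {unit: [0/1 treatment indicator per period]}
    matched_sets: {(treated_unit, t): [matched control units]}
    leads:        F, the number of periods after t to look ahead
    """
    kept = {}
    for (unit, t), controls in matched_sets.items():
        end = t + leads + 1
        if len(panel[unit]) < end:
            continue  # lead window not fully observed
        if not all(d == 1 for d in panel[unit][t:end]):
            continue  # treated unit reverses inside the lead window
        stable = [c for c in controls
                  if all(d == 0 for d in panel[c][t:end])]
        if stable:  # drop matched sets left with no usable control
            kept[(unit, t)] = stable
    return kept


# Hypothetical data: two treated units switching on at t = 1.
panel = {
    "treated_1": [0, 1, 1, 1, 1],
    "treated_2": [0, 1, 0, 0, 0],
    "ctrl_a": [0, 0, 0, 0, 0],
    "ctrl_b": [0, 0, 0, 1, 1],
}
matched_sets = {
    ("treated_1", 1): ["ctrl_a", "ctrl_b"],
    ("treated_2", 1): ["ctrl_a"],
}

print(stable_policy_sets(panel, matched_sets, leads=2))
# {('treated_1', 1): ['ctrl_a']}
```

With `leads=2`, treated_2 is dropped because it reverses at $t+1$, and ctrl_b is dropped because it becomes treated at $t+2$. The price of the cleaner comparison is fewer matched sets, so it is worth checking how many survive the filter.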

That being said, if you know the rebate program is in place for at least a couple of months, then stick with the default settings (i.e., forbid.treatment.reversal = FALSE). It isn't fatal to observe a few treated units returning to the control condition within the specified lead window; that is a likely scenario when assessing long-term causal effects. Presently, I am working on a paper where I know the policy is in effect for at least 3 months. I "know" this through domain-specific knowledge of the policy's administration and inspection of my treatment variation plot. With very few exceptions, units return to the control condition only after the third period. Given this variation, I do not impose any additional constraints on the matched sets; I'm interested in the ATT over those three periods.

On the other hand, if you're interested in the effects of "switching out" (opting out), then you may be interested in their alternative causal quantity of interest: the average treatment effect of policy reversal among the reversed (ART). Put simply, the ART is a comparison of units still in the rebate program with those that opted out (i.e., reversed). This may be worth pursuing, assuming you observe enough reversals.

In rare settings where all treated units return to the control condition at the same time, it's difficult (if not impossible) to get matched sets for the ART. In other words, if all treated units opt out of the program by, say, the fourth period, then policy reversal is deterministic and you needn't worry about it. If this is the case, then the ATT should be your preferred choice.

In sum, the best part about PanelMatch is that it allows the user to define the causal quantity of interest. Since the policy variable is allowed to switch on and off multiple times over time, we should be very deliberate with our choice of $L$ (the number of lags) and $F$ (the number of leads). In the latter case, how far forward to look for policy effects is entirely up to you.
