I'm a bit confused. Shouldn't your regression be collinear?

Your command is:
`xtnbreg y i.TREAT##i.POST i.yrsfromtreatment i.year, fe robust`

Let:

- $I_{i,t,\tau}$ be a variable indicating whether individual $i$ is $\tau$ years from treatment at time $t$.
- $\mathrm{Treat}_i$ be an indicator for whether $i$ receives treatment at any point.
- $\mathrm{Post}_{it}$ be an indicator for whether $i$ has received treatment at time $t$ or in a prior period.

It looks like that command runs the regression:
$$ y_{it} = b \mathrm{Treat}_{i}\mathrm{Post}_{it}+ \sum_{\tau=0}^{\bar{\tau}} c_\tau I_{i,t,\tau }+ u_i + v_t + \epsilon_{it} $$

What confuses me is this: if $\tau$ takes the discrete values $0$, $1$, $2$, etc., then shouldn't we have

$$ \mathrm{Treat}_{i}\mathrm{Post}_{it} = \sum_{\tau = 0}^{\bar{\tau}} I_{i,t,\tau}$$
That is, if individual $i$ had treatment prior to or in year $t$ (i.e., $\mathrm{Treat}_{i}\mathrm{Post}_{it} = 1$), then they were either treated this year (i.e., $I_{i,t,0} = 1$), or treated last year (i.e., $I_{i,t,1} = 1$), or treated the year before that ($I_{i,t,2} = 1$), and so on.
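
To make the suspected identity concrete, here is a minimal sketch on a made-up panel. The unit names, years, and treatment dates are all hypothetical, and it assumes a dummy is included for every $\tau$ from $0$ to $\bar{\tau}$ and that the event-time dummies are all zero for untreated observations. How `yrsfromtreatment` is actually coded for untreated observations is not stated in the question, so this only checks the algebra.

```python
# Toy panel: unit id -> year of first treatment (None = never treated).
# All ids, years and treatment dates are made up for illustration.
first_treated = {"A": 2002, "B": 2004, "C": None}
years = range(2000, 2007)

rows = []
for unit, start in first_treated.items():
    for year in years:
        treat = int(start is not None)
        post = int(start is not None and year >= start)
        # Years since treatment; None before treatment or if never treated.
        tau = year - start if post else None
        rows.append({"unit": unit, "year": year,
                     "treat": treat, "post": post, "tau": tau})

max_tau = max(r["tau"] for r in rows if r["tau"] is not None)

def event_dummies(row):
    """One indicator I_{i,t,tau} for each tau = 0..max_tau."""
    return [int(row["tau"] == tau) for tau in range(max_tau + 1)]

# If this holds in every row, Treat*Post is an exact linear combination
# of the event-time dummies, i.e. the regressors are perfectly collinear.
identity_holds = all(
    r["treat"] * r["post"] == sum(event_dummies(r)) for r in rows
)
print(identity_holds)
```

Under these assumptions the check passes in every row, which is exactly the collinearity being asked about.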

I want to begin by addressing your model specification.

A country may enter and leave treatment at any point in time, so I don't think collinearity is a problem with the fixed effects.

In addition to fixed effects for *country* and *year*, your model includes two *main effects*. One main effect indexes treated countries (i.e., "war" indicator); the other main effect is a time dummy (i.e., "post-conflict" indicator).

Since the "timing" of treatment is not uniform across all entities, you do not have a situation that lends itself to the classical difference-in-differences (DD) approach with two groups and two discrete time periods. War may begin at different times in different countries. Likewise, countries may enter into and out of a wartime condition. As a result, your "post-conflict" (i.e., post-treatment) variable is not well-defined.

Your "war" main effect, which is constant within a country, is collinear with your *country* fixed effect $\eta_{i}$. You also include a *time dummy* which indexes post-treatment periods; this will be collinear with your *year* fixed effect $\nu_{t}$. I do not claim that your model is inestimable, but rather that your software will have to make adjustments for the model to run. If you include a "post-conflict" variable, software may drop one *additional* year dummy to accommodate the full set of year effects. In sum, you can approach this in the same way but with a few fewer variables. Here is another formulation of the more general DD equation:

$$
y_{it} = \alpha + \phi y_{i,t-1} + \delta \textrm{Wartime}_{it} + \theta X_{it} + \eta_{i} + \nu_{t} + \epsilon_{it},
$$

where the model is the same as before, but with a treatment dummy representing countries in wartime years. The variable $\textrm{Wartime}_{it}$ *is* your interaction term (i.e., $\textrm{War} \cdot \textrm{PC}$). Again, $\textrm{Wartime}_{it}$ is coded explicitly; it equals 1 in precisely those country-year observations in which a country is in a wartime condition, and 0 otherwise.
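
As a rough illustration of coding the treatment dummy explicitly, the sketch below builds a country-year panel from a list of war spells. The country labels, years, and the `war_spells` structure are hypothetical, and it assumes the dummy is switched on for every year a country is at war (not only the year of entry), so that a country can move into and out of treatment.

```python
# Hypothetical war spells: country -> list of (first_year, last_year), inclusive.
war_spells = {
    "X": [(1995, 1997), (2003, 2003)],  # enters and leaves war twice
    "Y": [(1999, 2001)],
    "Z": [],                            # in the sample but never at war
}
years = range(1994, 2005)

def wartime(country, year):
    """Return 1 if the country is at war in that year, else 0."""
    return int(any(start <= year <= end for start, end in war_spells[country]))

# One row per country-year, with the explicitly coded treatment dummy.
panel = [
    {"country": c, "year": t, "wartime": wartime(c, t)}
    for c in war_spells
    for t in years
]

x_series = [row["wartime"] for row in panel if row["country"] == "X"]
print(x_series)
```

Because the dummy varies within country and within year, it is not absorbed by either set of fixed effects, unlike the separate "war" and "post-conflict" main effects.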

If war is treatment and militarized interstate dispute is the control, should I include observations that don't have either the treatment or control?

There are many ways to proceed and I hope other contributors will offer their input. In a 'generalized' DD framework, there is always some implicit treatment and control group comparison. Countries officially entering into the wartime condition can serve as your treatment group; armed combat (i.e., war) is your *treatment*. Countries ensnared in militarized interstate disputes, but never engaging in armed conflict, can serve as controls.

One way to proceed is to restrict your sample to only those countries engaged in militarized interstate disputes. In the years *preceding* wartime exposure, you observe the economic outputs of all countries. All observed countries are in the militarized dispute condition in the pre-exposure period; this is your baseline (no treatment) condition. In some year (but not precisely the same year for all treated countries), wartime exposure affects a subset of countries, *but not others*. This is what I was referring to earlier when I stated that a country cannot be in *both* treatment and control groups in the same country-year period. Put differently, the country is either at war, or in some condition of militarized interstate conflict/negotiation.

My only concern is whether a country can be at war with one country while simultaneously involved in a militarized interstate conflict with another, *possibly neighboring*, country. I am sure you've considered this possibility. In my estimation, *war* is a clearly defined exposure. You could make the case that war is qualitatively different from being in a state of conflict/negotiation. I am not familiar with the details of your study, but you could also investigate different types of treatment. The economic health of a nation is undoubtedly influenced by the length of time at war, or even more so by the 'intensity' of that war.

I also imagine you observe countries *before* the onset of militarized interstate conflicts as well. In this case, you observe countries throughout several *phases*. In other words, there is a peacetime epoch, a militarized interstate dispute epoch, and a wartime epoch. I think you are most concerned with how to incorporate these different 'conflict phases' into your model. You make a solid argument that a declaration of war is as good as randomly assigned *conditional* on the country being in a state of militarized interstate conflict.

For observations that fight a war and a militarized interstate dispute at the same time, can I code them as a war since they were exposed to treatment, or should I drop them entirely?

I assume "at the same time" implies a country with multiple conflicts, such that it is "at war" with one country and also embroiled in a militarized interstate dispute with another. Can you code a specific 'country-year' as "at war" if the country is involved in multiple conflicts? If yes, then I would estimate the effect of being "at war" both with and without the 'multi-conflict' countries in your sample and compare the results.

As noted earlier, you still have a well-defined exposure. However, your treatment is possibly confounded by the fact that some countries might be more predisposed to war than others. Is a country more likely to declare war if it was engaged in *more than one* interstate conflict in the pre-exposure period? I might suspect a country would be more bellicose if involved in multiple conflicts. Moreover, I might suspect a country's spatial proximity to a nearby belligerent government would also affect their exposure status. The contemporaneous cross-correlation across your $i$ countries might be a concern. These are my substantive musings.

I also wonder if excluding observations in the peacetime epoch is deliberately reducing the number of pre- or post-treatment observations in your panel (I say *post*-treatment as in *beyond conclusion of the war*). Some countries may only be involved in a militarized interstate dispute for a couple of years prior to a war, while others may be involved in interstate conflict(s) for decades. I would proceed by assessing the group trends in your economic output variables across these different epochs.

The more I think about it, the more I think I will use several different timings of GDP growth, such as current year, following year, following two years, etc. I also think a good placebo test would be lagged GDP growth, as fighting a war should not affect the previous year's growth unless there is something wrong with the design.

You *should* consider adjusting the time configuration of your $\textrm{Wartime}_{it}$ variable to monitor how your treatment affects your outcome in different epochs. Note, the coefficient on $\textrm{Wartime}_{it}$ is the *immediate* (contemporaneous) effect of treatment. If you also acquired data *after* countries moved out of the wartime condition, then you can adjust your treatment variable by one or more periods to assess the persistence of wartime exposure on economic outputs.

This is known in the literature as 'lagging' your *treatment indicator*. One way to proceed is the following:

$$
y_{it} = \alpha + \phi y_{i,t-1} + \sum_{\tau = 0}^{m}\delta_{-\tau} \textrm{Wartime}_{i,t - \tau} + \theta X_{it} + \eta_{i} + \nu_{t} + \epsilon_{it},
$$

where the sum on the right-hand side includes the contemporaneous term plus $m$ lags (i.e., $\delta_{0}, \delta_{-1}, \delta_{-2}, \ldots, \delta_{-m}$). These are additive, time-varying treatment effects. Note that when $\tau = 0$, your estimate of $\delta_{0}$ is your contemporaneous effect; it is precisely the same wartime exposure period from before. If you are interested in how war affects a country's economic health once it concludes, then lagging your treatment variable is one way to do it. You might also be interested in the *anticipatory effects* of war, in which case you could also incorporate one or more leads of your treatment variable. Note that there are many variations on this general theme, so make sure you can justify your approach. Ultimately, you should be guided by your particular discipline and overall understanding of how the treatment affects your outcome. This brings me to my next concern.
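
For concreteness, here is a small sketch of how lags and leads of a treatment indicator line up within a single country. The helper `shift_within_unit` and the example series are hypothetical, and the sketch assumes the series is sorted by year with no gaps; with gaps, shifting by position would not correspond to shifting by calendar time.

```python
def shift_within_unit(series, k):
    """Shift one unit's time-ordered series by k periods.

    k > 0 gives the k-th lag (the value from period t - k);
    k < 0 gives a lead. Periods with no source observation
    are set to None (missing).
    """
    n = len(series)
    return [series[t - k] if 0 <= t - k < n else None for t in range(n)]

# Hypothetical wartime indicator for one country, ordered by year.
wartime = [0, 0, 1, 1, 0, 0, 0]

lag1 = shift_within_unit(wartime, 1)    # Wartime_{i,t-1}
lag2 = shift_within_unit(wartime, 2)    # Wartime_{i,t-2}
lead1 = shift_within_unit(wartime, -1)  # Wartime_{i,t+1}

print(lag1)
print(lag2)
print(lead1)
```

The shift has to be done separately for each country so that one country's last year never leaks into the next country's first year. Each lag also costs one observation per country at the start of the panel, and each lead costs one at the end.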

Incorporating lags of an **independent** (treatment) variable is different from including a lagged **dependent** variable as a covariate. Your model explicitly includes a lag of your outcome as a regressor. You can do this, but it introduces bias. In your case, you incorporate *both* the unobserved country-specific effect $\eta_{i}$ *and* a lagged dependent variable on the right-hand side of your equation. Consistent estimation is compromised when you condition on both $\eta_{i}$ and $y_{i,t-1}$. Think about what would happen if you *demeaned* or *differenced* your equation: you would remove the fixed effect, but the demeaned/differenced error is necessarily correlated with the demeaned/differenced lagged dependent variable. Your lagged outcome on the right-hand side is therefore *not* distributed independently of the error term. This bias is more pronounced with small *T* and an autocorrelated error process. Software has fixes for this (dynamic panel estimators such as Arellano-Bond-type GMM), and they may require you to go looking for suitable instruments.
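
The bias described here (often called Nickell bias) is easy to see in a small simulation. The sketch below is only illustrative: the data-generating process, the number of units and periods, and the true $\phi = 0.5$ are arbitrary assumptions, and it leaves out the treatment, covariates, and year effects to isolate the interaction between $\eta_{i}$ and $y_{i,t-1}$.

```python
import random

def within_estimate_of_phi(n_units=500, n_periods=5, phi=0.5, seed=0):
    """Simulate y_it = phi * y_{i,t-1} + eta_i + e_it and return the
    fixed-effects (within) estimate of phi."""
    rng = random.Random(seed)
    num = den = 0.0
    for _ in range(n_units):
        eta = rng.gauss(0, 1)
        y = eta / (1 - phi)                  # start at the unit's long-run mean
        path = []
        for _ in range(50 + n_periods + 1):  # 50 burn-in periods
            y = phi * y + eta + rng.gauss(0, 1)
            path.append(y)
        path = path[-(n_periods + 1):]
        y_lag, y_cur = path[:-1], path[1:]
        # Demeaning within the unit removes eta_i ...
        m_lag = sum(y_lag) / n_periods
        m_cur = sum(y_cur) / n_periods
        # ... but the demeaned lag is now correlated with the demeaned error.
        num += sum((a - m_lag) * (b - m_cur) for a, b in zip(y_lag, y_cur))
        den += sum((a - m_lag) ** 2 for a in y_lag)
    return num / den

phi_hat = within_estimate_of_phi()
print(round(phi_hat, 3))
```

With only five periods per unit, the within estimate should come out well below the true value of 0.5 even with many units. Raising `n_periods` shrinks the gap; raising `n_units` alone does not.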

Another way to proceed is to run two equations. First, run your fixed effects model as before and drop the GDP lag. Then, rerun the same model with the GDP lag, but now drop the unobserved country-specific effect. Applied econometricians have shown that fixed-effects and lagged-dependent-variable estimates have a useful "bracketing property" (see Angrist and Pischke, 2009). In other words, the two estimates bound the causal effect of interest. See pages 182-186 of this online resource if you do not have access to their book.

I hope this clears things up!

## Best Answer

Yes, it makes sense and in this case the coefficient for the interaction of the post-treatment indicator and the treatment variable gives you the effect on the outcome that results from an increase in the treatment intensity. An example of this is the paper by Acemoglu, Autor and Lyle (2004), where they estimate the effect of World War II on female labor supply in the US. In their model

$$y_{ist} = \delta_s + \gamma d_{1950} + X'_{ist}\beta + \varphi \left(d_{1950}\cdot m_s\right) + \epsilon_{ist}$$

Here $y_{ist}$ is weeks worked by woman $i$ in state $s$ in year $t$. They have two periods, 1940 and 1950, where $d_{1950}$ is a dummy for the latter year, $X_{ist}$ is a vector of individual characteristics, $\delta_s$ are state dummies, and $m_s$ is the mobilization rate in each state. Their interaction estimates whether states with higher mobilization rates during WWII saw a stronger rise in women's weeks worked from 1940 to 1950. This is given by the coefficient $\varphi$.

This is also a difference-in-differences (DiD) setting with variable treatment intensity, since mobilization rates $m_s$ are continuous and differ across states. They get a point estimate of 11.2 for $\varphi$, i.e., a 10 percentage point increase in the mobilization rate increased female labor supply by 1.1 weeks (note that this reading requires the mobilization rate to be measured as a fraction between 0 and 1). States with higher treatment intensity therefore saw a bigger increase in female labor supply as a result of the "treatment".
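
The arithmetic behind that interpretation can be written out as follows. This assumes $m_s$ enters the regression as a fraction, so that a 10 percentage point increase corresponds to $\Delta m_s = 0.10$; if it were entered on a 0 to 100 scale, the same coefficient would imply an implausibly large effect.

$$\Delta y = \varphi \cdot \Delta m_s = 11.2 \times 0.10 = 1.12 \approx 1.1 \text{ weeks}$$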