This is a model in which you control for a state-by-state linear time trend as well as variations from that trend that are common to all states at each individual time.
To see this, consider some synthetic data generated according to this model. (The method to create them is described at the end of this post.) They consist of five observations in each of three states over eight consecutive years. No covariates $X_{ist}$ are involved, because their inclusion will shed no light on the issue of modeling time effects.
Because you are ultimately interested in the effects of the $D_{st}$ variable, this plot distinguishes the symbols by its values. The cases with $D_{st}=1$ occur only in years 4 and 5. On the face of it, they are not unusual.
We could fit a model with linear time trends in each state, controlling for $D_{st}$:
$$y_{ist}=\alpha_{0s}+\alpha_{1s}t + \quad\quad\quad + \theta D_{st} + \epsilon_{ist}$$
The $\lambda_t$ term is omitted.
Here are the fitted trends, one per state, controlling for $D_{st}$:
You can see the states do experience different rates of change over time. Moreover, there is some collective variation around those fitted lines. In particular, the values for State 1 in years 4 and 5 are unusually high--and these are the ones associated with $D_{st}=1.$ Should we attribute this to a real effect or to some form of variation that affects all states, independently of $D_{st}$?
Let's examine the residuals:
I have collected the residuals into boxplots (a) by time (the black-and-white wide boxplots in the background) and (b) by time and state (the colored narrower boxplots in the foreground). You can see that the residuals change substantially from one time to the next, but those for all states change in the same manner. We need to control for this common year-to-year variation, which is what the $\lambda_t$ term does, in order to determine whether the unusually high values for $D_{st}=1$ in years 4 and 5 in State 1 are meaningful.
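The residual pattern described above can be checked numerically as well as graphically. The following is only a sketch: the parameter values (`alpha0`, `alpha1`, `lambda`, the error standard deviation, and the seed) are made up for illustration and are not the ones used to produce the plots in this post. It fits the trend-only model and tabulates the mean residual by year, overall and within each state.

```r
# Hypothetical parameter values, chosen only for illustration
set.seed(17)
X <- expand.grid(Obs = 1:5, State = factor(paste0("S.", 1:3)), Time = 1:8)
alpha0 <- c(1, 2, 3)                    # state intercepts (assumed)
alpha1 <- c(-1, -0.5, 0.25)             # state slopes (assumed)
lambda <- c(0, 2, 0, 0, 2.5, 2, 4, 2)   # common year effects (assumed)
X$D.st <- as.numeric(X$State == "S.1" & X$Time %in% 4:5)
s <- as.integer(X$State)
X$Value <- alpha0[s] + alpha1[s] * X$Time + lambda[X$Time] + 6 * X$D.st +
  rnorm(nrow(X), sd = 0.5)

# State-specific linear trends plus D.st, with no year effects
fit.trend <- lm(Value ~ -1 + State + State:Time + D.st, data = X)

# Mean residual by year, overall and within each state
res.year <- tapply(residuals(fit.trend), X$Time, mean)
res.state.year <- tapply(residuals(fit.trend), list(X$State, X$Time), mean)
print(round(res.year, 2))
print(round(res.state.year, 2))
```

If the rows of `res.state.year` rise and fall together from one year to the next, that is the common year-to-year variation the trend-only model has left in its residuals.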
The software might complain when you fit the full model, which includes $\lambda_t$. This is because the presence of the $\lambda_t$ term, which provides a separate mean value for each year, effectively establishes a "baseline" to which all the states are compared. This creates a redundancy, in exactly the same way any categorical variable creates one, requiring us to interpret all temporal changes as being relative to the baseline. The OLS procedure in R, `lm`, elects not to fit a slope for the last state:
    lm(formula = Value ~ -1 + State + State:Time + Year + D.st, data = X)

(`Year` is a categorical version of the numerical `Time` variable.)

    Coefficients: (1 not defined because of singularities)
                  Estimate Std. Error t value Pr(>|t|)
    StateS.1       1.58315    0.22167   7.142 1.17e-10 ***
    StateS.2       2.35555    0.21895  10.759  < 2e-16 ***
    StateS.3       2.41142    0.18867  12.781  < 2e-16 ***
    Year2          2.10770    0.19827  10.631  < 2e-16 ***
    Year3         -0.20172    0.20507  -0.984    0.327
    Year4          0.11881    0.23027   0.516    0.607
    Year5          2.59317    0.24377  10.638  < 2e-16 ***
    Year6          2.18162    0.24749   8.815 2.42e-14 ***
    Year7          3.85025    0.26703  14.419  < 2e-16 ***
    Year8          2.26431    0.28843   7.851 3.38e-12 ***
    D.st           5.45442    0.23999  22.728  < 2e-16 ***
    StateS.1:Time -1.14550    0.05237 -21.874  < 2e-16 ***
    StateS.2:Time -0.67605    0.05237 -12.909  < 2e-16 ***
    StateS.3:Time       NA         NA      NA       NA
Incidentally, the coefficient of $D_{st}$ used to generate these data was set at $\theta=6$. The OLS fit in this example is $\hat\theta=5.45\pm 0.24.$ That's reasonably close, within about 10% of the true value.
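The redundancy can be seen directly in the design matrix, without any response values. The sketch below assumes the layout described in this post (five observations, three states, eight years) and assumes $D_{st}=1$ only for State 1 in years 4 and 5. The linear dependency is that the three `State:Time` columns add up to `Time`, which is also a combination of the `State` dummies (they sum to 1) and the `Year` dummies.

```r
# Design only: 5 observations x 3 states x 8 years; no response is needed
X <- expand.grid(Obs = 1:5, State = factor(paste0("S.", 1:3)), Time = 1:8)
X$Year <- factor(X$Time)
X$D.st <- as.numeric(X$State == "S.1" & X$Time %in% 4:5)  # assumed placement

M <- model.matrix(~ -1 + State + State:Time + Year + D.st, data = X)
n.columns <- ncol(M)
design.rank <- qr(M)$rank
cat("columns:", n.columns, " rank:", design.rank, "\n")

# Time written two ways from columns of M
slope.cols <- M[, grep(":Time$", colnames(M))]
state.cols <- M[, grep("^StateS\\.[0-9]+$", colnames(M))]
year.cols <- M[, grep("^Year", colnames(M))]          # Year2, ..., Year8
time.from.slopes <- rowSums(slope.cols)
time.from.years <- rowSums(state.cols) + as.vector(year.cols %*% (1:7))
```

Both `time.from.slopes` and `time.from.years` reproduce `X$Time`, so the rank falls one short of the number of columns and `lm` has to drop one coefficient.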
It might be helpful to see how these data were generated. I created arrays to hold the values of the parameters and used those to compute the `Value` field in a dataframe `X` of rows (indexed by $i$) that contain the `State` ($s$), numerical `Time` ($t$), and 0-1 numerical `d.st` codes ($D_{st}$):
    X$Value <- with(X, states.intercept[State] +
                       states.slope[State] * Time +
                       effects.time[Time] +
                       effects.main * c(d.st) +
                       errors)
    X$Year <- factor(X$Time) # Used by `lm` for individual time terms lambda_t
Here, `states.intercept` is $\alpha_{0s}$, `states.slope` is $\alpha_{1s}$, `effects.time` is $\lambda_t$, `effects.main` is $\theta$, and `errors` are iid Normally distributed random values to realize $\epsilon_{ist}$.
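For readers who want to run something end to end, here is one way the pieces could be assembled. It is a sketch under stated assumptions, not the code that produced the output above: the seed, the numeric values of `states.intercept`, `states.slope`, and `effects.time`, the error standard deviation, and the construction of `X` with `expand.grid` are all invented for illustration. Only $\theta=6$ and the 5 x 3 x 8 layout come from the post, and the placement of $D_{st}=1$ in State 1, years 4 and 5, is inferred from the description.

```r
set.seed(17)  # arbitrary seed, for reproducibility only
n.obs <- 5; n.states <- 3; n.times <- 8

# One row per observation i in state s at time t
X <- expand.grid(Obs = 1:n.obs, State = 1:n.states, Time = 1:n.times)

# Parameter arrays; every value except effects.main is made up
states.intercept <- c(1, 2, 3)            # alpha_{0s}
states.slope <- c(-1, -0.5, 0.25)         # alpha_{1s}
effects.time <- rnorm(n.times, sd = 1.5)  # lambda_t
effects.main <- 6                         # theta, as stated in the post
errors <- rnorm(nrow(X), sd = 0.5)        # epsilon_{ist}

# D_st = 1 only for state 1 in years 4 and 5 (assumed from the description)
d.st <- as.numeric(X$State == 1 & X$Time %in% 4:5)
X$D.st <- d.st

X$Value <- with(X, states.intercept[State] +
                   states.slope[State] * Time +
                   effects.time[Time] +
                   effects.main * c(d.st) +
                   errors)
X$State <- factor(paste0("S.", X$State))  # labels matching the output above
X$Year <- factor(X$Time)                  # categorical time for lambda_t

fit <- lm(Value ~ -1 + State + State:Time + Year + D.st, data = X)
theta.hat <- unname(coef(fit)["D.st"])
print(summary(fit))
```

Because the parameter values differ from the original ones, the estimates will not match the table above, but the structure of the output, including one coefficient reported as `NA`, should be the same.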
Best Answer
If you have individual fixed effects, your estimate of the state dummy will be based upon within-individual variation (i.e., it will be based upon the people who move across state lines). If no one switches states, then the state dummy will not be identified.
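A small simulation illustrates the point. The panel below is entirely hypothetical (six individuals, four periods, three states labelled A, B, and C, pure-noise outcome); the names `panel`, `id`, `period`, `state`, and `state2` are invented for this sketch. With no movers, each state dummy is a sum of individual dummies, so `lm` cannot estimate it. Once a single person moves from A to B, the B dummy becomes estimable, while the C dummy, which nobody moves into or out of, does not.

```r
set.seed(1)
# Hypothetical panel: 6 individuals, 4 periods, nobody changes state
panel <- expand.grid(id = factor(1:6), period = 1:4)
panel$state <- factor(c("A", "A", "B", "B", "C", "C")[as.integer(panel$id)])
panel$y <- rnorm(nrow(panel))

# Individual fixed effects plus state dummies: the state dummies are aliased
fit.stayers <- lm(y ~ id + state, data = panel)
print(coef(fit.stayers)[c("stateB", "stateC")])

# Let individual 1 move from A to B in the last two periods
panel$state2 <- panel$state
panel$state2[panel$id == "1" & panel$period >= 3] <- "B"
fit.movers <- lm(y ~ id + state2, data = panel)
print(coef(fit.movers)[c("state2B", "state2C")])
```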