These are also called "individual-specific intercepts", because one way to estimate the FE model is to run a "least-squares dummy variables" (LSDV) regression, in which one regresses $y$ on $x$ and $n$ dummy variables, one for each individual in the panel, where each dummy takes the value one if an observation belongs to that individual (person, household, firm, ...) and zero otherwise. The $\hat\alpha_i$ then estimate these intercepts, which may be interpreted like the usual intercept in a regression, with the only difference that each intercept is specific to a single unit.
The unobserved effects model is specified as:
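As a minimal sketch of the LSDV idea, the snippet below simulates a small panel and regresses $y$ on $x$ plus one dummy per unit using NumPy. The panel size, the "true" intercepts and slope, and all variable names are assumptions made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 4, 5                               # hypothetical panel: 4 units, 5 periods
unit = np.repeat(np.arange(n), T)         # unit index of each observation
alpha = np.array([1.0, 2.0, 3.0, 4.0])    # assumed unit-specific intercepts
beta = 0.5                                # assumed common slope

# x is correlated with the unit effect, the case FE is designed for.
x = rng.normal(size=n * T) + alpha[unit]
y = beta * x + alpha[unit] + rng.normal(scale=0.1, size=n * T)

# One dummy per unit and no common constant (avoids the dummy variable trap).
D = (unit[:, None] == np.arange(n)).astype(float)
Z = np.column_stack([x, D])

coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
beta_hat, alpha_hat = coef[0], coef[1:]
print("beta_hat :", beta_hat)
print("alpha_hat:", alpha_hat)
```

Each element of `alpha_hat` equals that unit's mean of $y$ minus `beta_hat` times its mean of $x$, which is why the dummies can be read as unit-specific intercepts.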
\begin{equation}
y = X\beta + u
\end{equation}
where
\begin{equation}
u_{it} = c_{i} + \lambda_{t} + v_{it}
\end{equation}
A one-way error model assumes $\lambda_{t} = 0$, while a two-way error model allows for $\lambda_{t} \in \mathbb{R}$; that is the answer to the first question.
The second question cannot be answered without more assumptions about the error structure or the purpose of the study. Following Wooldridge (2010), chapters 10 and 11, each of the assumptions can be generalized to cover the temporal error component as well. For example, for pooled OLS (POLS) the critical assumption is $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}u_{it}\right) = 0$. In the chapter this is summarized as meeting the following conditions:
- $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}c_{i}\right) = 0$
- $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}v_{it}\right) = 0$
However, if one does not assume $\lambda_{t} = 0$, i.e., in the two-way error model, a third condition must be satisfied for consistency of the POLS estimator:
\begin{equation}
\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}\lambda_{t}\right) = 0
\end{equation}
and so on.
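To see where the third condition comes from, here is a short sketch. It assumes the expectations exist and writes the composite error with explicit indices, $u_{it} = c_{i} + \lambda_{t} + v_{it}$. By linearity of the expectation,
\begin{equation}
\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}u_{it}\right) = \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}c_{i}\right) + \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}\lambda_{t}\right) + \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}v_{it}\right)
\end{equation}
so setting each of the three terms to zero is sufficient for the POLS condition. In the one-way model the middle term vanishes automatically because $\lambda_{t} = 0$.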
In the case of estimating the fixed effects, one can go with LSDV (including indicators for the panel ID and the temporal ID), but the number of dummies can quickly become infeasible. One alternative is to use the one-way error within estimator and include time dummies, as one usually does with software that does not offer a two-way error estimator, such as Stata. A third, and computationally the most efficient, way is to estimate it with the two-way error within estimator, which for a balanced panel applies OLS to the transformed model:
\begin{equation}
y_{it} - \bar{y}_{i.} - \bar{y}_{.t} + \bar{y}_{..} = (x_{it} - \bar{x}_{i.} - \bar{x}_{.t} + \bar{x}_{..})\beta + (v_{it} - \bar{v}_{i.} - \bar{v}_{.t} + \bar{v}_{..})
\end{equation}
This approach is implemented in several statistical packages, such as the R package plm, which correctly adjust the degrees of freedom to account for the $T - 1$ additional parameters compared to the one-way error within estimator.
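The equivalence between the two-way within transformation and LSDV with unit and time dummies can be checked numerically. The following is a rough NumPy sketch on a simulated balanced panel with a single regressor. The data-generating values and the helper name `two_way_demean` are hypothetical, and this is not how plm or any other package implements the estimator internally.

```python
import numpy as np

rng = np.random.default_rng(1)
n, T = 6, 4                      # hypothetical balanced panel
c = rng.normal(size=n)           # assumed unit effects c_i
lam = rng.normal(size=T)         # assumed time effects lambda_t
beta = 2.0                       # assumed slope

# Store the panel as (n, T) arrays: row = unit, column = period.
x = rng.normal(size=(n, T)) + c[:, None] + lam[None, :]
y = beta * x + c[:, None] + lam[None, :] + rng.normal(scale=0.1, size=(n, T))

def two_way_demean(a):
    """Return a_it - a_i. - a_.t + a_.. for a balanced (n, T) array."""
    return (a - a.mean(axis=1, keepdims=True)
              - a.mean(axis=0, keepdims=True) + a.mean())

# Two-way within estimator: OLS on the transformed data (one regressor).
xt, yt = two_way_demean(x), two_way_demean(y)
beta_within = (xt * yt).sum() / (xt ** 2).sum()

# LSDV: n unit dummies plus T - 1 time dummies (first period is the base).
unit = np.repeat(np.arange(n), T)
time = np.tile(np.arange(T), n)
D_unit = (unit[:, None] == np.arange(n)).astype(float)
D_time = (time[:, None] == np.arange(1, T)).astype(float)
Z = np.column_stack([x.ravel(), D_unit, D_time])
beta_lsdv = np.linalg.lstsq(Z, y.ravel(), rcond=None)[0][0]

# Residual degrees of freedom: nT - n - (T - 1) - k, with k = 1 regressor.
df_resid = n * T - n - (T - 1) - 1
print(beta_within, beta_lsdv, df_resid)
```

The two slope estimates agree up to floating-point error, and `df_resid` shows the extra $T - 1$ parameters that a naive OLS on the transformed data would fail to count.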
Most two-way error model estimators are not limited to balanced panels (only a handful are). For short panels, running the one-way error within estimator with time dummies is feasible. As a side note, even if one obtains estimates of the temporal effects, it is important to note that, as with the LSDV fixed effects in one-way error models, these are not consistent, because the number of effects to be estimated grows with the number and length of the panels.
I recommend the Baltagi (2013) textbook for a fairly comprehensive explanation of the estimators for one-way and two-way error models.
References:
Baltagi, Badi H. 2013. Econometric Analysis of Panel Data. Fifth Edition. Chichester, West Sussex: John Wiley & Sons, Inc. ISBN: 978-1-118-67232-7.
Croissant, Yves, and Giovanni Millo. 2008. “Panel Data Econometrics in R: The plm Package.” Journal of Statistical Software 27 (2). doi:10.18637/jss.v027.i02.
StataCorp. 2017. Stata 15 Base Reference Manual. College Station, TX: Stata Press.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Kindle Edition. The MIT Press. ISBN: 978-0-262-23258-8.
Best Answer
If the observations for each individual are demeaned, then the only variation left is how much each individual deviates from their mean over time.
Think about it this way. In the fixed effects model $y_{it} = x_{it}\beta + \alpha_i + u_{it}$, where $\alpha_i$ is the coefficient on a dummy for entity $i$, the dummies absorb each entity's average. Once these are partialled out, only each $i$'s deviation from its own average is used to estimate $\beta$.
The demeaned model does the same thing. Subtracting $i$'s mean from each of $i$'s observations centers them around 0. The only variation left is each $i$'s deviation from its own mean. If $i$ does not change over time, $x_{it} - \bar x_i = 0$ for every $t$, so $i$ does not contribute to the estimate of $\beta$. If $i$ does change over time, this gets picked up in $\beta$. So only changes within $i$ are used in estimation.
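A small NumPy sketch can make the last point concrete. In the simulated panel below (all values and the helper names `demean_by_unit` and `within_beta` are assumptions for illustration), unit 0 has an $x$ that never changes, so its demeaned $x$ is zero and dropping it leaves the within estimate of $\beta$ unchanged.

```python
import numpy as np

rng = np.random.default_rng(2)
n, T = 5, 6                         # hypothetical panel
unit = np.repeat(np.arange(n), T)
x = rng.normal(size=n * T)
x[unit == 0] = 3.0                  # unit 0 has no variation in x over time
alpha = rng.normal(size=n)          # assumed unit effects
y = 1.5 * x + alpha[unit] + rng.normal(scale=0.1, size=n * T)

def demean_by_unit(v, unit, n):
    """Subtract each unit's own mean from its observations."""
    means = np.bincount(unit, weights=v, minlength=n) / np.bincount(unit, minlength=n)
    return v - means[unit]

def within_beta(x, y, unit, n):
    """One-way within estimator for a single regressor."""
    xd, yd = demean_by_unit(x, unit, n), demean_by_unit(y, unit, n)
    return (xd * yd).sum() / (xd ** 2).sum()

xd = demean_by_unit(x, unit, n)
print("demeaned x for unit 0:", xd[unit == 0])

beta_all = within_beta(x, y, unit, n)

# Drop unit 0 and relabel the remaining units 0..n-2.
keep = unit != 0
beta_drop = within_beta(x[keep], y[keep], unit[keep] - 1, n - 1)
print(beta_all, beta_drop)
```

Unit 0 still has variation in $y$, but because its demeaned $x$ is zero it adds nothing to either the numerator or the denominator of the estimator.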