Given that a dummy $\alpha_i$ for each country is included (or rather the deviation of each country from a common mean, which would be the way that Stata includes a constant term $\alpha_0$), this is a fixed effects model. You can estimate the fixed effects model either by subtracting the country-specific mean of each variable from the variable itself (this is called the within transformation) and using the demeaned variables in an OLS regression or, as they do here, by running OLS with a dummy for each country. The latter is referred to as the least squares dummy variables (LSDV) model.
They also write:
"Based on this conceptual framework, we examine the determinants of the pricing of sovereign risk both in non-crisis and crisis states for a range of advanced and emerging economies, using a standard panel model with country fixed effects" (p. 64)
The unobserved effects model is modeled as:
\begin{equation}
y = X\beta + u
\end{equation}
where
\begin{equation}
u = c_{i} + \lambda_{t} + v_{it}
\end{equation}
A one-way error model assumes $\lambda_{t} = 0$, while a two-way error model allows $\lambda_{t} \in \mathbb{R}$; that is the answer to the first question.
The second question cannot be answered without more assumptions about the error structure or purpose of the study. Using Wooldridge (2010) chapters 10 and 11, generalize each of the assumptions to cover the temporal error structure as well. For example, when considering POLS, the critical assumption is $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}u\right) = 0$. In the chapter it is summarized as meeting the following conditions:
- $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}c\right) = 0$
- $\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}v\right) = 0$
However, if one does not assume $\lambda_{t} = 0$, i.e., two-way error model, a third condition must be satisfied for consistency of the POLS estimator:
\begin{equation}
\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}\lambda\right) = 0
\end{equation}
and so on.
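To see where the third condition comes from, one can substitute the error decomposition into the POLS moment condition. This is only a sketch, writing the composite error with explicit indices as $u_{it} = c_{i} + \lambda_{t} + v_{it}$:
\begin{equation}
\mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}u_{it}\right) = \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}c_{i}\right) + \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}\lambda_{t}\right) + \mathop{\mathbb{E}}\left(\mathbf{x}_{it}^{\prime}v_{it}\right)
\end{equation}
By linearity of the expectation, the three conditions together are sufficient for the POLS condition to hold.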
In the case of estimating the fixed effects, one can go with LSDV (including indicators for the panel ID and the time ID), but the dimension can quickly become infeasible. One alternative is to use the one-way error within estimator and include time dummies, as one usually does with software that does not offer a two-way error within estimator, such as Stata. A third and most efficient way is to estimate it with the two-way error within estimator.
\begin{equation}
y_{it} - \bar{y}_{i.} - \bar{y}_{.t} + \bar{y}_{..} = (x_{it} - \bar{x}_{i.} - \bar{x}_{.t} + \bar{x}_{..})\beta + (v_{it} - \bar{v}_{i.} - \bar{v}_{.t} + \bar{v}_{..})
\end{equation}
This approach is implemented in several statistical packages, such as the R package plm (Croissant and Millo 2008), which correctly adjust the degrees of freedom for the $T - 1$ additional parameters compared to the one-way error within estimator.
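The two-way within transformation above can be sketched numerically for a balanced panel. The snippet below is a minimal illustration on simulated data, not the method of any particular package: the panel dimensions, seed, coefficient values and the helper name `two_way_demean` are all assumptions made for the example. It compares the two-way within estimate with LSDV using both unit and time dummies.

```python
import numpy as np

rng = np.random.default_rng(0)
n, T, k = 6, 5, 2                      # hypothetical balanced panel: n units, T periods

c = rng.normal(size=(n, 1, 1))         # unit effects c_i
lam = rng.normal(size=(1, T, 1))       # time effects lambda_t
x = rng.normal(size=(n, T, k)) + c + lam   # regressors correlated with both effects
beta = np.array([1.5, -0.7])           # assumed true coefficients
y = x @ beta + c[..., 0] + lam[..., 0] + rng.normal(size=(n, T))


def two_way_demean(a):
    """Subtract unit means and time means, then add back the grand mean."""
    return (a
            - a.mean(axis=1, keepdims=True)      # unit means (over t)
            - a.mean(axis=0, keepdims=True)      # time means (over i)
            + a.mean(axis=(0, 1), keepdims=True))  # grand mean


# Two-way within estimator: OLS on the transformed data.
x_t = two_way_demean(x).reshape(n * T, k)
y_t = two_way_demean(y).reshape(n * T)
beta_within, *_ = np.linalg.lstsq(x_t, y_t, rcond=None)

# LSDV: OLS on the raw data plus unit dummies and T - 1 time dummies.
D_unit = np.kron(np.eye(n), np.ones((T, 1)))     # rows are ordered unit by unit
D_time = np.kron(np.ones((n, 1)), np.eye(T))
W = np.hstack([x.reshape(n * T, k), D_unit, D_time[:, 1:]])
coef, *_ = np.linalg.lstsq(W, y.reshape(n * T), rcond=None)
beta_lsdv = coef[:k]

print(beta_within, beta_lsdv, np.allclose(beta_within, beta_lsdv))
```

One time dummy is dropped because the unit dummies already span the constant. The equality of the two slope estimates relies on the panel being balanced; with unbalanced data the simple demeaning formula no longer reproduces LSDV.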
Most two-way error model estimators are not limited to balanced panels (only a handful are). For short panels, running the one-way error within estimator with time dummies is feasible. As a side note, even if one obtains estimates of the temporal effects, it is important to note that, as with the LSDV fixed effects in one-way error models, these are not consistent, because the number of parameters to estimate grows with the number and length of the panels.
I recommend the Baltagi (2013) textbook for a fairly comprehensive explanation of the estimators for one-way and two-way error models.
References:
Baltagi, Badi H. 2013. Econometric Analysis of Panel Data. Fifth Edition. Chichester, West Sussex: John Wiley & Sons, Inc. ISBN: 978-1-118-67232-7.
Croissant, Yves, and Giovanni Millo. 2008. “Panel Data Econometrics in R: The plm Package.” Journal of Statistical Software 27 (2). doi:10.18637/jss.v027.i02.
StataCorp. 2017. Stata 15 Base Reference Manual. College Station, TX: Stata Press.
Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. Kindle Edition. The MIT Press. ISBN: 978-0-262-23258-8.
Best Answer
To see equality, let us first derive the FE estimator.
Define the residual-maker matrix
\begin{align*}
\underset{(M\times M)}{\mathbf{Q}}&:=\mathbf{I}_M-\mathbf{1}_M(\mathbf{1}_M'\mathbf{1}_M)^{-1}\mathbf{1}_M'\\
&=\mathbf{I}_M-\frac{1}{M}\mathbf{1}_M\mathbf{1}_M'\\
&=\mathbf{I}_M-\left(\begin{array}{ccc} 1/M & \cdots & 1/M \\ \vdots & \ddots & \vdots \\ 1/M & \cdots & 1/M \end{array}\right),
\end{align*}
where $M$ denotes the number of observations per individual unit in the panel.
Premultiplication with $\mathbf{Q}$ centers the $\mathbf{y}_i$ and $\mathbf{Z}_i$ around their averages over the $M$ observations of unit $i$,
\begin{align*}
\mathbf{Q}\mathbf{y}_i&=\mathbf{y}_i-\mathbf{1}_M\mathbf{1}_M'\mathbf{y}_i/M\\
&=\mathbf{y}_i-\mathbf{1}_M\overline{y}_{i}.
\end{align*}
This also implies that every time-invariant variable in the set of regressors $\mathbf{Z}_i$ turns into a column of zeros, and hence is eliminated from the data.
This is a serious disadvantage of the FE estimator. Consider the example of wage regressions for a panel of employees. Variables such as gender or schooling are of primary interest, but (typically) do not change over time (anymore).
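The centering property of $\mathbf{Q}$, and the way it wipes out time-invariant regressors, can be checked on a toy example. The numbers below are made up for illustration; only the definition of $\mathbf{Q}$ comes from the text.

```python
import numpy as np

M = 4                                   # observations for one unit (toy value)
ones = np.ones((M, 1))
Q = np.eye(M) - ones @ ones.T / M       # Q = I_M - (1/M) 1_M 1_M'

y_i = np.array([2.0, 4.0, 6.0, 8.0])    # made-up outcome for unit i
Z_i = np.column_stack([
    [1.0, 3.0, 2.0, 5.0],               # time-varying regressor
    [7.0, 7.0, 7.0, 7.0],               # time-invariant regressor
])

Qy = Q @ y_i                            # equals y_i minus its mean
QZ = Q @ Z_i                            # second column becomes all zeros

print(Qy)
print(QZ)
print(np.allclose(Q @ Q, Q))            # Q is idempotent
```

The second column of `QZ` is zero (up to floating-point error), which is the numerical counterpart of a time-invariant variable dropping out of the within regression.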
As $\mathbf{Q}\mathbf{1}_M=\mathbf{0}$, we have, using the error-component model $\mathbf{y}_i=\mathbf{Z}_i\mathbf{\delta}+\mathbf{1}_M\alpha_i+\mathbf{\eta}_{i}$, where $\mathbf{\eta}_i$ denotes the $M$-vector of idiosyncratic time-varying errors,
\begin{align*}
\mathbf{Q}\mathbf{y}_i&=\mathbf{Q}\mathbf{F}_i\mathbf{\beta}+\mathbf{Q}\mathbf{\eta}_{i}\qquad i=1,\ldots,n\\
\tilde{\mathbf{y}}_i&=\tilde{\mathbf{F}}_i\mathbf{\beta}+\tilde{\mathbf{\eta}}_{i},
\end{align*}
where a tilde denotes premultiplication by $\mathbf{Q}$, $\mathbf{F}_i$ is the $(M\times L_b)$-matrix of the observations on the time-variant regressors in $\mathbf{Z}_i$, and $\mathbf{\beta}$ collects their coefficients. Stacking the observations over the $n$ units gives
$$
\underset{(Mn\times 1)}{\tilde{\mathbf{y}}}:=\left(\begin{array}{c} \tilde{\mathbf{y}}_1 \\ \vdots \\ \tilde{\mathbf{y}}_n \end{array}\right)\qquad
\underset{(Mn\times L_b)}{\tilde{\mathbf{F}}}:=\left(\begin{array}{c} \tilde{\mathbf{F}}_1 \\ \vdots \\ \tilde{\mathbf{F}}_n \end{array}\right).
$$
The FE estimator is simply OLS applied to these $Mn$ observations: \begin{align*} \widehat{\mathbf{\beta}}_{\text{FE}}&=(\tilde{\mathbf{F}}'\tilde{\mathbf{F}})^{-1}\tilde{\mathbf{F}}'\tilde{\mathbf{y}} \end{align*}
To see the equality between FE and least squares dummy variables, stack the observations a bit further:
\begin{equation}
\underset{(Mn\times 1)}{\mathbf{y}}:=\left(\begin{array}{c} \mathbf{y}_1 \\ \vdots \\ \mathbf{y}_n \end{array}\right)\;
\underset{(Mn\times L_b)}{\mathbf{F}}:=\left(\begin{array}{c} \mathbf{F}_1 \\ \vdots \\ \mathbf{F}_n \end{array}\right)
\end{equation}
and
\begin{equation}
\underset{(Mn\times 1)}{\mathbf{\eta}}:=\left(\begin{array}{c} \mathbf{\eta}_1 \\ \vdots \\ \mathbf{\eta}_n \end{array}\right)\;
\underset{(n\times 1)}{\mathbf{\alpha}}:=\left(\begin{array}{c} \alpha_1 \\ \vdots \\ \alpha_n \end{array}\right).
\end{equation}
Further, let
$$
\underset{(Mn\times n)}{\mathbf{D}}:=\mathbf{I}_n\otimes\mathbf{1}_M=\left(\begin{array}{ccc} \mathbf{1}_M & & \mathbf{O} \\ & \ddots & \\ \mathbf{O} & & \mathbf{1}_M \end{array}\right).
$$
Then, the linear panel data model under an error-component assumption is obtained in matrix notation as $$ \mathbf{y}=\mathbf{D}\mathbf{\alpha}+\mathbf{F}\mathbf{\beta}+\mathbf{\eta}, $$ a dummy-variable model.
That is, we can also obtain an estimator of $\mathbf{\beta}$ from an OLS regression of $\mathbf{y}$ on the regressors $\mathbf{F}$ and $n$ individual-specific dummies.
Now, note that the Frisch-Waugh-Lovell Theorem says that the OLS estimator of $\mathbf{\beta}$ can be found by regressing $\mathbf{M}_{\mathbf{D}}\mathbf{y}$ on $\mathbf{M}_{\mathbf{D}}\mathbf{F}$, where $$\underset{(Mn\times Mn)}{\mathbf{M}_{\mathbf{D}}}:=\mathbf{I}-\mathbf{D}(\mathbf{D}'\mathbf{D})^{-1}\mathbf{D}'$$ Using symmetry and idempotency of $\mathbf{M}_{\mathbf{D}}$ gives \begin{equation} \widehat{\mathbf{\beta}}_{\text{LSDV}}=(\mathbf{F}'\mathbf{M}_{\mathbf{D}}\mathbf{F})^{-1}\mathbf{F}'\mathbf{M}_{\mathbf{D}}\mathbf{y} \end{equation}
Now, \begin{align*} \mathbf{M}_{\mathbf{D}}&=\mathbf{I}_{Mn}-(\mathbf{I}_n\otimes\mathbf{1}_M)[(\mathbf{I}_n\otimes\mathbf{1}_M)'(\mathbf{I}_n\otimes\mathbf{1}_M)]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M)'\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[(\mathbf{I}_n\otimes\mathbf{1}_M')(\mathbf{I}_n\otimes\mathbf{1}_M)]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[\mathbf{I}_n\otimes\mathbf{1}_M'\mathbf{1}_M]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)[\mathbf{I}_n\otimes M]^{-1}(\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)\left[\mathbf{I}_n\otimes \frac{1}{M}\right](\mathbf{I}_n\otimes\mathbf{1}_M')\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-(\mathbf{I}_n\otimes\mathbf{1}_M)\left[\mathbf{I}_n\otimes \frac{1}{M}\mathbf{1}_M'\right]\\ &=\mathbf{I}_{n}\otimes\mathbf{I}_{M}-\mathbf{I}_n\otimes\mathbf{1}_M\frac{1}{M}\mathbf{1}_M'\\ &=\mathbf{I}_{n}\otimes\left(\mathbf{I}_{M}-\frac{1}{M}\mathbf{1}_M\mathbf{1}_M'\right)\\ &=\mathbf{I}_n\otimes\mathbf{Q} \end{align*}
Thus,
\begin{align*}
\mathbf{M}_{\mathbf{D}}\mathbf{F}&=(\mathbf{I}_n\otimes\mathbf{Q})\mathbf{F}\\
&=\left(\begin{array}{ccc} \mathbf{Q} & & \\ & \ddots & \\ & & \mathbf{Q} \end{array}\right)\mathbf{F}\\
&=\tilde{\mathbf{F}},
\end{align*}
so that $$\widehat{\mathbf{\beta}}_{\text{LSDV}}=\widehat{\mathbf{\beta}}_{\text{FE}}.$$
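The identity $\mathbf{M}_{\mathbf{D}}=\mathbf{I}_n\otimes\mathbf{Q}$ and the resulting equality of the two estimators can also be checked numerically for a balanced panel. The following is a minimal sketch on simulated data; the dimensions, seed and coefficient values are assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(42)
n, M, L_b = 3, 4, 2                     # hypothetical balanced panel dimensions

ones = np.ones((M, 1))
Q = np.eye(M) - ones @ ones.T / M       # within-unit demeaning matrix
D = np.kron(np.eye(n), ones)            # dummy matrix D = I_n kron 1_M
M_D = np.eye(n * M) - D @ np.linalg.inv(D.T @ D) @ D.T
kron_identity_holds = np.allclose(M_D, np.kron(np.eye(n), Q))

# Simulated data with regressors correlated with the unit effects.
alpha = rng.normal(size=n)
F = rng.normal(size=(n * M, L_b)) + (D @ alpha)[:, None]
beta = np.array([1.0, -2.0])            # assumed true coefficients
y = D @ alpha + F @ beta + rng.normal(size=n * M)

# FE: OLS on the demeaned data.
F_tilde, y_tilde = M_D @ F, M_D @ y
beta_fe = np.linalg.solve(F_tilde.T @ F_tilde, F_tilde.T @ y_tilde)

# LSDV: OLS of y on the dummies and the raw regressors.
W = np.hstack([D, F])
coef = np.linalg.solve(W.T @ W, W.T @ y)
beta_lsdv = coef[n:]                    # the first n entries are the alpha estimates

print(kron_identity_holds, np.allclose(beta_fe, beta_lsdv))
```

Forming $\mathbf{M}_{\mathbf{D}}$ explicitly is only sensible for toy sizes like these; in practice one would demean by group rather than build an $Mn\times Mn$ matrix.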
Incidentally, while the notation above assumes balanced panel data, the result also goes through in the unbalanced case, as one can check either with more complicated notation or numerically.
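A minimal sketch of such a numerical check for an unbalanced panel follows. It is not the original illustration: the data are simulated, and the group sizes, seed, coefficient values and the helper name `demean_by_unit` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(1)
T_i = np.array([2, 5, 3, 4])            # hypothetical, unequal observations per unit
n = len(T_i)
ids = np.repeat(np.arange(n), T_i)      # unit index of every observation
N = ids.size

alpha = rng.normal(size=n)              # unit effects
F = rng.normal(size=(N, 2)) + alpha[ids, None]   # regressors correlated with alpha
beta = np.array([0.5, -1.0])            # assumed true coefficients
y = F @ beta + alpha[ids] + rng.normal(size=N)


def demean_by_unit(a, ids):
    """Subtract each unit's own mean, using that unit's observations only."""
    out = a.astype(float).copy()
    for i in np.unique(ids):
        rows = ids == i
        out[rows] -= a[rows].mean(axis=0)
    return out


# FE: OLS on the within-transformed data.
beta_fe, *_ = np.linalg.lstsq(demean_by_unit(F, ids), demean_by_unit(y, ids), rcond=None)

# LSDV: OLS on the raw regressors plus one dummy per unit.
D = (ids[:, None] == np.arange(n)).astype(float)
coef, *_ = np.linalg.lstsq(np.hstack([F, D]), y, rcond=None)
beta_lsdv = coef[:2]

print(beta_fe, beta_lsdv, np.allclose(beta_fe, beta_lsdv))
```

The only change relative to the balanced case is that each unit is demeaned with its own number of observations, so the block-diagonal projection has blocks of different sizes.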