Solved – FGLS and time fixed effects

autocorrelation, fixed-effects-model, generalized-least-squares, least squares, panel data

Context: I am performing growth regressions on a panel data set in R, including individual and time fixed effects. Estimating with OLS delivers results that seem to suffer from serial correlation. For that reason I want to re-estimate my model with FGLS.

Problem: The R package "plm" does provide FGLS estimation via pggls. However, estimation is only possible with individual OR time fixed effects (not both at the same time). I consulted the recommended literature (Wooldridge, "Econometric Analysis of Cross Section and Panel Data"), but it was not really helpful on this issue.
Certainly, I could easily work around that by including dummies for time periods.

Question: Is it possible to include time-fixed and individual effects at the same time while estimating a model with FGLS? Would including both violate some statistical properties?

Best Answer

Yes, including both violates certain statistical properties. The pggls documentation states this indirectly:

Conversely, this structure is assumed identical across groups and thus general FGLS estimation is inefficient under groupwise heteroskedasticity. Note also that this method requires estimation of T(T+1)/2 variance parameters, thus efficiency requires N >> T (if effect="individual", else the opposite).

FGLS requires an estimator of the covariance matrix of the regression disturbances. For the individual effects panel data model:

$$y_{it} = x_{it}\beta + c_i + u_{it},$$

where $c_{i}$ is an individual effect, it is assumed that $u_{it}$ are independent from $u_{jt}$ for each $i\neq j$, so you are left with $\frac{T(T+1)}{2}$ covariances $\text{cov}(u_{it},u_{is})$.
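To make the $\frac{T(T+1)}{2}$ count concrete, here is a minimal sketch of how such a covariance matrix can be estimated from first-stage residuals by averaging the outer products over individuals. This is the textbook averaging estimator, and I am assuming (not asserting) that pggls does something of this form internally; the function name and the residual values are hypothetical.

```python
def pooled_residual_covariance(residuals):
    """Average the outer products u_i u_i' over individuals.

    residuals: list of N lists, each holding the T residuals of one individual.
    Returns a T x T nested list; by symmetry only T(T+1)/2 entries are distinct.
    """
    n = len(residuals)
    t = len(residuals[0])
    return [
        [sum(u[a] * u[b] for u in residuals) / n for b in range(t)]
        for a in range(t)
    ]


# Hypothetical residuals for N = 3 individuals observed over T = 2 periods.
u_hat = [[1.0, 0.5], [-1.0, -0.5], [2.0, 1.0]]
omega = pooled_residual_covariance(u_hat)
print(omega)
```

Each entry is an average over $N$ individuals, which is why the estimate only becomes reliable when $N$ is large relative to $T$.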

For the time effects panel data model:

$$y_{it} = x_{it}\beta + d_t + u_{it},$$

where $d_t$ is a time effect, it is assumed that $u_{it}$ are independent from $u_{is}$ for each $t\neq s$, so you are left with $\frac{N(N+1)}{2}$ covariances $\text{cov}(u_{it},u_{jt})$.

Now if you have both time and individual effects:

$$y_{it} = x_{it}\beta + c_i + d_t + u_{it},$$

the question arises: which covariances $\text{cov}(u_{it},u_{js})$ are zero? If you assume that none of them is zero, you are left with $\frac{NT(NT+1)}{2}$ unknown parameters and only $NT$ data points, which makes the problem infeasible.
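A quick way to see the scale of the problem is to plug numbers into the three counts. The sketch below does only that arithmetic; the function name and the panel dimensions ($N = 100$, $T = 10$) are made up for illustration.

```python
def covariance_parameter_counts(n, t):
    """Number of distinct covariance parameters under each assumption."""
    return {
        "data_points": n * t,
        "individual_effects": t * (t + 1) // 2,    # T(T+1)/2
        "time_effects": n * (n + 1) // 2,          # N(N+1)/2
        "unrestricted": n * t * (n * t + 1) // 2,  # NT(NT+1)/2
    }


# Hypothetical panel: 100 individuals observed over 10 periods.
counts = covariance_parameter_counts(n=100, t=10)
for name, value in counts.items():
    print(f"{name}: {value}")
```

With these dimensions there are 1000 data points, 55 covariance parameters under individual effects, 5050 under time effects, and 500500 if nothing is restricted.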

Note 1. The independence assumption mentioned above can be relaxed to zero covariances.

Note 2. In both the individual effects and the time effects model there are $NT$ data points. FGLS is an asymptotic procedure: it requires the number of data points to increase while the number of parameters remains fixed. Hence for individual effects $N$ must increase, and for time effects $T$ must increase. More often than not, these requirements are satisfied by quite different data sources, so the answer to your problem depends on your data source. Is it more likely that $N$ is increasing, or $T$? Since you mention time dummies, I suspect that $N$ is increasing, so I suggest using time dummies.
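For concreteness, the time-dummy approach can be written as the individual effects model with the time effect moved into the regressors. The coefficients $\delta_s$ and the indicator notation $1[t=s]$ are my own notation for this sketch, and I take the first period as the reference category:

$$y_{it} = x_{it}\beta + c_i + \sum_{s=2}^{T} \delta_s \, 1[t=s] + u_{it}.$$

This adds $T-1$ regression coefficients, a number that stays fixed as $N$ grows, while the covariance structure is still the one from the individual effects case, with $\frac{T(T+1)}{2}$ covariances $\text{cov}(u_{it},u_{is})$.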