Hausman test: whether to include year effects and/or interaction variables

Tags: hausman, hypothesis testing, multiple regression, panel data

I have a problem when performing a Hausman test.

I have a panel dataset with five panels. I am estimating the same model twice, once with quarterly data and once with half-yearly data.

My dependent variable and some of my explanatory variables vary across both individuals and time. However, I also have one time variable, year, entered as dummies, and two interaction terms with a dummy.

The year dummies do not change across panels, since I use the same years for every panel, and they do not change within a year: with quarterly data each year appears four times per panel, and with half-yearly data twice per panel. Something like this:

Year | Quarter
1998 | 1998q1
1998 | 1998q2
1998 | 1998q3
1998 | 1998q4
(...)

In the case of the interaction terms, the values do change across individuals and time, but only in 4 of the 5 panels, since for the first panel the variable is multiplied by zero. Therefore, all the values for the first panel are zero.

In the section on comparing FE and RE, Wooldridge (2010, p. 329), "Econometric Analysis of Cross Section and Panel Data", says:

"Because the FE approach only identifies coefficients on time-varying
explanatory variables, we clearly cannot compare FE and RE
coefficients on time-constant variables. But there is a more subtle
issue: we cannot include in our comparison coefficients on aggregate
time-effects–that is, variables that change only across t. (…) the
problem with comparing coefficients on aggregate time effects is not
one of identification; we know RE and FE both allow inclusion of a
full set of time period dummies. The problem is one of singularity in
the asymptotic variance matrix of the difference between FE beta
estimate and RE beta estimate."

After experimenting I have the following problems:

1) If I use only the 'pure' variables (no interaction terms), with or without year effects, I get the error I asked about here.

2) If I include the interaction terms, everything seems OK. But is it OK to include these interaction terms when, in at least one panel, their values do not change over t?

3) The test results with and without year effects differ: in one case the test is significant and in the other it is not. Regardless of these results, should I include year effects (year dummies) in the models whose estimates I use for the Hausman test?

Best Answer

1) Are you using the hausman or the xtoverid command? You can try the hausman command with the sigmamore option, which sometimes resolves a negative test statistic. A negative statistic can arise in small samples, because the estimated difference between the FE and RE variance matrices need not be positive definite, and the sigmamore option takes this into account. It is also useful with respect to the point made by Wooldridge, because this option bases the test on a common estimate of the disturbance variance.
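To see how a negative statistic can come about, here is a small numerical sketch in plain Python (not Stata output). The helper name `hausman_2x2` and all coefficient and variance values are made up for illustration; the point is only that the statistic is a quadratic form in the inverse of the estimated variance difference, so it turns negative as soon as that difference is not positive definite.

```python
def hausman_2x2(b_fe, b_re, v_fe, v_re):
    """Hausman statistic for two coefficients (hypothetical helper).

    b_fe, b_re: lists with the two FE and RE coefficient estimates.
    v_fe, v_re: their 2x2 covariance matrices as nested lists.
    """
    # Difference between the FE and RE coefficient vectors.
    d0 = b_fe[0] - b_re[0]
    d1 = b_fe[1] - b_re[1]
    # Difference between the two estimated covariance matrices.
    a = v_fe[0][0] - v_re[0][0]
    b = v_fe[0][1] - v_re[0][1]
    c = v_fe[1][0] - v_re[1][0]
    e = v_fe[1][1] - v_re[1][1]
    det = a * e - b * c
    if det == 0:
        raise ValueError("variance difference is singular")
    # Quadratic form d' M^{-1} d, using the explicit 2x2 inverse.
    return (d0 * (e * d0 - b * d1) + d1 * (-c * d0 + a * d1)) / det


# Illustrative numbers only, not estimates from any real dataset.
b_fe, b_re = [1.0, 0.5], [0.8, 0.6]
v_re = [[0.03, 0.0], [0.0, 0.02]]

# FE variances exceed RE variances for both coefficients: positive statistic.
h_ok = hausman_2x2(b_fe, b_re, [[0.05, 0.0], [0.0, 0.04]], v_re)

# First FE variance is estimated below the RE one: negative statistic.
h_bad = hausman_2x2(b_fe, b_re, [[0.02, 0.0], [0.0, 0.04]], v_re)

print(h_ok, h_bad)
```

In the second call only one diagonal entry of the FE matrix changes, which is enough to flip the sign of the statistic.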

2) It is not a problem for the FE estimator (and certainly not for the RE estimator) if a regressor is time-invariant in only one of the panels. Within variation in all panels is of course helpful, since the FE estimator relies on it, but identification does not require within variability in every panel.
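A minimal sketch of this point in plain Python, assuming a single regressor and noise-free made-up data (the function name `within_slope` is hypothetical): the within estimator pools demeaned data across panels, so a panel whose regressor is always zero adds nothing to either the numerator or the denominator, and the slope is still recovered from the other panels.

```python
def within_slope(panels):
    """Pooled within (fixed-effects) slope for one regressor.

    panels: list of (x_values, y_values) pairs, one pair per panel.
    """
    num = 0.0
    den = 0.0
    for xs, ys in panels:
        x_bar = sum(xs) / len(xs)
        y_bar = sum(ys) / len(ys)
        # Demeaning within each panel removes the panel-specific intercept.
        for x, y in zip(xs, ys):
            num += (x - x_bar) * (y - y_bar)
            den += (x - x_bar) ** 2
    if den == 0:
        raise ValueError("no within variation in any panel")
    return num / den


# Made-up data generated as y = alpha_i + 2 * x, with no noise.
panels = [
    ([0, 0, 0, 0], [5, 5, 5, 5]),      # panel 1: regressor is always zero
    ([1, 2, 3, 4], [3, 5, 7, 9]),      # alpha = 1
    ([2, 4, 1, 3], [-6, -2, -8, -4]),  # alpha = -10
]
slope = within_slope(panels)
print(slope)
```

If every panel looked like panel 1, the denominator would be zero and the function would raise, which is the case where FE cannot identify the coefficient at all.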

3) Given your particular research question (banks depend on macroeconomic factors, which in turn vary over time) and the fact that both FE and RE can accommodate time dummies, you should include the time dummies. Wooldridge's first point refers to comparisons where the RE model contains variables that are entirely time-invariant and thus cannot be used in the FE model, which then amounts to comparing two completely different models. His second point, on aggregate time effects, concerns which coefficients enter the comparison rather than which variables enter the model: the time dummies can stay in both models, but their coefficients are left out of the comparison because of the singularity he describes. You might find pp. 4-10 of this lecture useful; it discusses some of your questions about the Hausman test, including time dummies.
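For reference, the traditional form of the statistic can be written as below. The notation is assumed here rather than quoted from the book, and the coefficient vectors contain only the M coefficients being compared.

```latex
H = \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)'
    \left[\widehat{\mathrm{Avar}}\left(\hat{\beta}_{FE}\right)
        - \widehat{\mathrm{Avar}}\left(\hat{\beta}_{RE}\right)\right]^{-1}
    \left(\hat{\beta}_{FE} - \hat{\beta}_{RE}\right)
\;\overset{a}{\sim}\; \chi^2_M
```

The matrix in square brackets has to be invertible for H to be computed this way. That is where the singularity in the quoted passage matters: according to Wooldridge, the matrix becomes singular once coefficients on aggregate time effects are part of the compared vector.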