Solved – Panel regression: what to do when Hausman test fails and want to keep time invariant regressors

Tags: fixed-effects-model, hausman, panel data, random-effects-model, regression

I am running a panel data regression. First, I ran a pooled OLS regression, then a random effects (RE) one. I carried out the Hausman test, and it rejected the null hypothesis (i.e., the result discourages using random effects rather than fixed effects). So, I did the following:

(1) I carried out a Hausman-Taylor regression (in Stata, xthtaylor). Using xtoverid, I found that this model is OK when compared with the fixed effects one. However, this model displays no $\text{R}^2$ value. Furthermore, I was told that Hausman-Taylor is not a good model to use when your goal is to use the model to estimate outcomes.

I am trying to find out about the minimum distance estimator. It should be a procedure that allows me to combine fixed effects with time invariant regressors. Is there a Stata command for that? Do you know any reference on this topic?

Any suggestion on how to deal with my current problem is very welcome!

Best Answer

The following not-yet-published paper is, in my opinion, an excellent introduction and answer to the problem you bring up:

http://polmeth.wustl.edu/media/Paper/FixedversusRandom_1_2.pdf

To summarize, you can still proceed with the random effects approach, but you must first modify the model to account for the fact that the within-cluster and between-cluster effects differ (i.e., what the Hausman test indicates). You can do this by adding the cluster means of your predictor as a separate predictor in the model, and then optionally also applying within-cluster centering to the original predictor. The details of this procedure and the resulting interpretations are discussed at some length in the paper linked above.
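The procedure described above can be sketched in Stata as follows. This is a minimal illustration, not the paper's own code: it assumes the nlswork example dataset that ships with Stata, and the variable names (ln_wage, tenure, grade, idcode) come from that dataset, not from the question. The new variables tenure_mean and tenure_within are hypothetical names chosen for the example.

```stata
* Illustrative data only: nlswork ships with Stata; none of these
* variable names come from the question.
webuse nlswork, clear
xtset idcode year

* Between component: person-specific mean of the time-varying predictor
bysort idcode: egen tenure_mean = mean(tenure)

* Within component: deviation from the person-specific mean
generate tenure_within = tenure - tenure_mean

* Random effects with separate within and between effects;
* the time-invariant regressor grade stays in the model
xtreg ln_wage tenure_within tenure_mean grade, re vce(cluster idcode)

* Wald test of equal within and between effects
test tenure_within = tenure_mean
```

If the final test rejects, the within and between effects differ, which is the same kind of evidence the Hausman test picks up. The coefficient on tenure_within is then read as the within-person effect, while grade remains estimable because the model is still a random effects model.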

Related Solutions

Solved – Time-invariant variables not being removed in Fixed Effects model. And feasibility of additional time dummies in Fixed Effect/Random modelling

Having an unbalanced panel is not a problem nowadays. In the past, when econometrics had to be done by hand, inverting matrices for unbalanced panels was more difficult, but for computers this is not an issue. The only remaining concern is why the panel is unbalanced: is it due to attrition? If so, is the attrition random or related to characteristics of the statistical units? For instance, in surveys people with higher education tend to be more responsive and therefore stay in the panel longer.

Regarding the fixed effects model, have you checked whether the variables that are time-invariant in theory are actually not varying over time? Sometimes coding errors sneak in and then all of a sudden a variable varies over time when it shouldn't. One way of checking this is to use the xtsum command, which displays overall, between, and within summary statistics. The time-invariant variables should have a within standard deviation of zero. If they don't, then something went wrong in the coding.
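As a rough sketch of that check, assuming the nlswork example dataset that ships with Stata (its variables race, grade, tenure, and idcode are stand-ins for your own, and grade_sd is a hypothetical helper variable):

```stata
* Illustrative data only: variable names are from nlswork, not from the question
webuse nlswork, clear
xtset idcode year

* Overall, between, and within summary statistics;
* look at the "within" row of each variable
xtsum race grade tenure

* Flag units in which a supposedly time-invariant variable changes
bysort idcode: egen double grade_sd = sd(grade)
count if grade_sd > 0 & !missing(grade_sd)
```

A non-zero count in the last line would point to units whose value changes over time and that are worth inspecting for coding errors.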

Having a negative Hausman test statistic is a bad sign: in theory the variance difference that the test is built on is positive semi-definite, so the statistic should be non-negative. Negative values point towards model misspecification or a sample that is too small.
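One thing that is sometimes tried when the statistic comes out negative is the sigmamore option of Stata's built-in hausman command, which bases both covariance matrices on the disturbance variance estimate from the efficient estimator. The sketch below assumes the nlswork example dataset and an arbitrary illustrative specification, and it uses default (non-clustered) standard errors, because hausman does not accept cluster-robust estimates.

```stata
* Illustrative data and specification only (nlswork ships with Stata)
webuse nlswork, clear
xtset idcode year

* Consistent estimator under both hypotheses
xtreg ln_wage age tenure ttl_exp, fe
estimates store fe

* Efficient estimator under the null
xtreg ln_wage age tenure ttl_exp, re
estimates store re

* Both covariance matrices use the RE disturbance variance estimate
hausman fe re, sigmamore
```

This does not cure a misspecified model; it only removes one finite-sample source of a non-positive-definite variance difference.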

If you cluster your standard errors you also need a modified version of the Hausman test. This is implemented in the xtoverid command. You can use it like this:

xtreg ln_r_prisperkg_Frst_102202 Dflere_mottak_tur i.landingsfylkekode i.kvartiler_ny markedsk_torsk gjenv_TAC_NØtorsk_år_prct lalder_fartøy i.fangstr r_minst_Frst_torsk gjenv_kvote_NØtorsk_fartøy_prct i.lengde_gruppering mobilitet, re vce(cluster fartyid)
xtoverid

Rejecting the null rejects the validity of the assumptions underlying the random effects model.

For fixed effects estimation, only the unit id given to xtset matters. Declaring a time variable does not remove time fixed effects. So running

xtset id time
xtreg y x, fe

gives you exactly the same results as

xtset id
xtreg y x, fe

The time variable is only needed for commands for which the sorting order of the data matters; for instance, xtserial, which tests for panel autocorrelation, requires it. So if you want to include time fixed effects, you need to add the time dummies yourself, for example via i.day. In this context, the season and year dummies make sense, so it's good that you use them.
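A minimal sketch of adding time fixed effects by hand, assuming the nlswork example dataset that ships with Stata (year plays the role of your time variable; the stored names unit_only and two_way are arbitrary):

```stata
* Illustrative data only: variable names are from nlswork, not from the question
webuse nlswork, clear
xtset idcode year

* Unit fixed effects only: declaring year in xtset adds no time effects
xtreg ln_wage tenure, fe
estimates store unit_only

* Unit and time fixed effects: the year dummies are added explicitly
xtreg ln_wage tenure i.year, fe
estimates store two_way

* Joint test that all year dummies are zero
testparm i.year
```

Comparing the tenure coefficient across the two stored estimates shows how much of the unit-only estimate was picking up common time shocks.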

Solved – Hausman test for panel data

What you are looking for is the $\chi^2$ statistic produced at the end of the test. The null hypothesis of the Hausman test is that the coefficients of the fixed and random effects models do not differ systematically from each other. A significant test statistic means that we reject the null.

In your case

    Test:  Ho:  difference in coefficients not systematic

                  chi2(5) = (b-B)'[(V_b-V_B)^(-1)](b-B)
                          =      341.23
                Prob>chi2 =      0.0000

the value of the test statistic is 341.23 and the p-value shows that this value is significant at the 1% level and even below (Prob>chi2 = 0.0000).
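For reference, the pieces of that output can also be read back from the results that hausman stores. The sketch below assumes the nlswork example dataset and an arbitrary specification, so its numbers will not match the 341.23 above.

```stata
* Illustrative data and specification only (nlswork ships with Stata)
webuse nlswork, clear
xtset idcode year

xtreg ln_wage age tenure ttl_exp, fe
estimates store fe
xtreg ln_wage age tenure ttl_exp, re
estimates store re

* b = consistent (FE) coefficients, B = efficient (RE) coefficients
hausman fe re

* Stored results: test statistic, degrees of freedom, p-value
display "chi2 = " r(chi2)
display "df   = " r(df)
display "p    = " r(p)

* The p-value is the upper tail of a chi-squared with r(df) degrees of freedom
display "p (recomputed) = " chi2tail(r(df), r(chi2))
```

A displayed Prob>chi2 of 0.0000 is a rounded value; r(p) holds the unrounded number.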

The same holds for your second test, but pay attention to the note displayed at the top of the test, which warns you that the coefficients in the two models you test are not the same. Stata normally issues such warnings for good reason. Ignoring one should rest on a deep understanding of what is going on and on being sure that ignoring it is sound (typically it is not).
