Solved – Collinearity in time dummies, fixed effect regression

fixed-effects-model, multicollinearity, stata

I am running a fixed-effects panel regression with 81 groups and 20 periods, so up to 1,620 observations (the panel is unbalanced).

I use the following to create dummies:

*create timedummy  
tabulate refper, generate(refdummy)

On visual inspection the dummies look fine. But when I run xtreg, it drops 6 of the time dummies due to collinearity. Can anyone explain what is happening here?

Best Answer

From the labelling of your variables, I suspect that the regressors you include are lagged. The fourth lag of a variable ("_L4") is only available from the fifth period onwards, which would explain why the first four time dummies drop out of your regression: none of the data from the first four time periods is actually used to estimate the model. Some of the other time dummies might drop out for similar reasons (maybe one of your variables is only available from period 3 onwards, meaning its fourth lag is only available from period 7 onwards). It looks like you might be regressing a differenced variable ("d") on the fourth lag of another differenced variable ("d" and "L4"), in which case it makes sense for the first few periods not to be included in estimation. This automatically means you cannot estimate the time dummies for these periods.
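One way to check this is to look at which periods actually enter the estimation sample. The following is only a sketch: the names id, y and x are hypothetical stand-ins for the panel identifier and the model variables (the original variable names are not shown here), and it assumes refper is an integer-valued period variable.

```stata
* Hypothetical names: id (panel identifier), y and x (model variables)
xtset id refper

* Differenced outcome on the fourth lag of a differenced regressor,
* with period dummies entered as factor variables
xtreg D.y L4.D.x i.refper, fe

* Periods that contribute observations to the estimation sample
tabulate refper if e(sample)

* Observations excluded from estimation, by period
tabulate refper if !e(sample)
```

Any period that never appears in the first table has no usable observations, so its dummy cannot be estimated and is dropped.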

Related Solutions

Solved – Binary panel logistic regression (xtlogit fixed effects) is not converging in Stata, how to resolve

There are 2 possibilities. One is that Stata has found a perfect max and cannot get to a better point. This is pretty unlikely, but a fellow can still dream.

The second, and more likely, scenario is that the optimizer wound up in a badly behaved (non-concave) region of the likelihood, where the computed gradient and Hessian give a bad direction for stepping.

Here are some possible solutions. Add the gradient maximize option so that Stata displays the gradient. If the gradient is zero, the optimizer found a max that may not be unique, but is a max. This is a valid result. If the gradient is not zero, the result is not valid. You can then try tightening the convergence criteria, for example with ltol(0) tol(1e-7), to see if the optimizer can work its way out of the bad region.

Also, adding the difficult maximize option sometimes helps.
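As a rough illustration of these options, the sketch below uses Stata's shipped union example dataset rather than the asker's data, so the variable names are assumptions made only for the example.

```stata
* Example data shipped with Stata (not the data from the question)
webuse union, clear
xtset idcode

* Show the gradient vector in the iteration log
xtlogit union age grade not_smsa south, fe gradient

* Tighter convergence criteria
xtlogit union age grade not_smsa south, fe ltol(0) tol(1e-7)

* Alternative stepping algorithm for non-concave regions
xtlogit union age grade not_smsa south, fe difficult
```

The options can also be combined in a single call if one of them alone does not help.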

Solved – Time-invariant variables not being removed in Fixed Effects model. And feasibility of additional time dummies in Fixed Effect/Random modelling

Having an unbalanced panel is not a problem nowadays. In the past, when econometrics had to be done by hand, inverting matrices for unbalanced panels was more difficult, but for computers this is not a problem. The only remaining worry is why the panel is unbalanced: is it due to attrition? If yes, is this attrition random or related to characteristics of the statistical units? For instance, in surveys people with higher education tend to be more responsive and stay in the panel longer for that reason.

Regarding the fixed effects model, have you checked whether the variables that are time-invariant in theory actually do not vary over time? Sometimes coding errors sneak in and then all of a sudden a variable varies over time when it shouldn't. One way of checking this is to use the xtsum command, which displays overall, between, and within summary statistics. The time-invariant variables should have a within standard deviation of zero. If they don't, then something went wrong in the coding.
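A minimal sketch of this check, using Stata's shipped nlswork example dataset instead of the data from the question; race stands in for a variable that is expected to be time-invariant, and sd_race is a helper name introduced only for this example.

```stata
* Example data shipped with Stata (not the data from the question)
webuse nlswork, clear
xtset idcode year

* Overall, between and within variation; a time-invariant variable
* should show a within standard deviation of zero
xtsum race

* Locate offending units: within-unit standard deviation per panel
bysort idcode: egen sd_race = sd(race)
list idcode year race if sd_race > 0 & !missing(sd_race)
```

If the list command returns any rows, those units have values that change over time and are worth inspecting for coding errors.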

A negative Hausman test statistic is a bad sign: the matrices that the test is built on are positive semi-definite in theory, so the theoretical values of the test statistic are non-negative. Negative values point towards model misspecification or a sample that is too small.

If you cluster your standard errors you also need a modified version of the Hausman test. This is implemented in the user-written xtoverid command (available from SSC), which is run after the random-effects estimation. You can use it like this:

* xtoverid is user-written: ssc install xtoverid
xtreg ln_r_prisperkg_Frst_102202 Dflere_mottak_tur i.landingsfylkekode i.kvartiler_ny markedsk_torsk gjenv_TAC_NØtorsk_år_prct lalder_fartøy i.fangstr r_minst_Frst_torsk gjenv_kvote_NØtorsk_fartøy_prct i.lengde_gruppering mobilitet, re vce(cluster fartyid)
xtoverid

Rejecting the null rejects the validity of the assumptions underlying the random effects model.

For fixed effects estimation, only the unit id declared with xtset is taken into account. Declaring a time variable does not add time fixed effects to the model. So

xtset id time
xtreg y x, fe

will give you exactly the same results as

xtset id
xtreg y x, fe

The time variable only needs to be specified for commands for which the sorting order of the data matters; for instance, xtserial, which tests for serial correlation in panel data, requires it. So if you want to include time fixed effects, you need to include the day dummies separately via i.day, for example. In this context, the season and year dummies make sense, so it's good that you use them.
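To illustrate adding time fixed effects explicitly, here is a short sketch on Stata's shipped nlswork example dataset, where year plays the role that day plays above; the variables are those of the example data, not of the question.

```stata
* Example data shipped with Stata (not the data from the question)
webuse nlswork, clear
xtset idcode year

* Unit fixed effects only
xtreg ln_wage tenure, fe

* Unit fixed effects plus time fixed effects via year dummies
xtreg ln_wage tenure i.year, fe

* Joint test of whether the year dummies are needed
testparm i.year
```

Using factor-variable notation (i.year) avoids creating the dummies by hand, and Stata chooses the base category automatically.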