Having an unbalanced panel is not a problem nowadays. In the past, when econometrics had to be done by hand, inverting matrices for unbalanced panels was more difficult, but for computers this is not an issue. The only remaining concern is why the panel is unbalanced: is it due to attrition? If so, is the attrition random or related to characteristics of the statistical units? For instance, in surveys people with higher education tend to be more responsive and therefore stay in the panel longer.
Regarding the fixed effects model, have you checked whether the variables that are time-invariant in theory actually do not vary over time? Sometimes coding errors sneak in and all of a sudden a variable varies over time when it shouldn't. One way of checking this is to use the xtsum command, which displays overall, between, and within summary statistics. The time-invariant variables should have a within standard deviation of zero. If they don't, something went wrong in the coding.
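The idea behind that check can be sketched outside Stata as well. The snippet below is only an illustration in Python, not what xtsum does internally: the function name `within_sd` and the toy data are made up, and the within standard deviation is computed from deviations around each unit's own mean.

```python
from collections import defaultdict
from math import sqrt

def within_sd(ids, values):
    """Standard deviation of the values after subtracting each unit's own mean."""
    groups = defaultdict(list)
    for unit, value in zip(ids, values):
        groups[unit].append(value)
    means = {unit: sum(v) / len(v) for unit, v in groups.items()}
    deviations = [value - means[unit] for unit, value in zip(ids, values)]
    # Within deviations sum to zero, so no further centering is needed.
    return sqrt(sum(d * d for d in deviations) / (len(deviations) - 1))

# Hypothetical panel: two firms observed three times each.
firm_ids = [1, 1, 1, 2, 2, 2]
industry_clean = [3, 3, 3, 5, 5, 5]     # truly time-invariant
industry_miscoded = [3, 3, 4, 5, 5, 5]  # one coding error for firm 1

sd_clean = within_sd(firm_ids, industry_clean)
sd_miscoded = within_sd(firm_ids, industry_miscoded)
print(sd_clean, sd_miscoded)
```

A strictly positive within standard deviation for a variable that should be constant within units is the signal to go back and inspect the coding.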
A negative Hausman test statistic is a bad sign: in theory the matrices that the test is built on are positive semi-definite, so the statistic cannot be negative. Negative values point towards model misspecification or too small a sample (related to this is this question).
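For reference, the textbook form of the statistic makes the point about sign explicit. The notation here is an assumption for illustration: $\hat\beta_{FE}$ and $\hat\beta_{RE}$ are the fixed and random effects estimates of the $k$ time-varying coefficients, and $\hat V_{FE}$, $\hat V_{RE}$ are their estimated covariance matrices.

```latex
H = (\hat\beta_{FE} - \hat\beta_{RE})'
    \left[\hat V_{FE} - \hat V_{RE}\right]^{-1}
    (\hat\beta_{FE} - \hat\beta_{RE})
    \;\overset{a}{\sim}\; \chi^2_k
    \quad \text{under } H_0 .
```

Under the null the random effects estimator is efficient, so the population counterpart of $\hat V_{FE} - \hat V_{RE}$ is positive semi-definite and $H \ge 0$. The estimated difference has no such guarantee in a finite sample, which is how a negative value can show up.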
If you cluster your standard errors you also need a modified version of the Hausman test. This is implemented in the xtoverid command (user-written, available from SSC). You can use it like this:
* xtoverid's fixed-vs-random test is run after the random effects estimate
xtreg ln_r_prisperkg_Frst_102202 Dflere_mottak_tur i.landingsfylkekode i.kvartiler_ny markedsk_torsk gjenv_TAC_NØtorsk_år_prct lalder_fartøy i.fangstr r_minst_Frst_torsk gjenv_kvote_NØtorsk_fartøy_prct i.lengde_gruppering mobilitet, re vce(cluster fartyid)
xtoverid
Rejecting the null rejects the validity of the assumptions underlying the random effects model.
The xtset command only takes the unit id into account for fixed effects estimation. Declaring a time variable does not remove time fixed effects. So running
xtset id time
xtreg y x, fe
will give you the exact same results as
xtset id
xtreg y x, fe
The time variable only matters for commands that depend on the sort order of the data; for instance, xtserial, which tests for panel autocorrelation, requires it. This has been discussed here. So if you want to include time fixed effects, you need to add the day dummies yourself, for example via i.day. In this context, the season and year dummies make sense, so it's good that you use them.
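To see why the time effects have to be added explicitly, here is a small Python sketch. It is an illustration only, not Stata's algorithm: the data are invented, the true slope is set to 2, and the helper names `demean` and `slope` are hypothetical. Demeaning by unit alone corresponds to a one-way fixed effects estimate; demeaning by unit and then by time corresponds to also including time dummies (this shortcut is exact only because the toy panel is balanced).

```python
def demean(values, keys):
    """Subtract from each value the mean of its group, with groups given by keys."""
    totals, counts = {}, {}
    for k, v in zip(keys, values):
        totals[k] = totals.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return [v - totals[k] / counts[k] for k, v in zip(keys, values)]

def slope(x, y):
    """OLS slope for a single regressor when x and y are already demeaned."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Hypothetical balanced panel: 3 units observed in 3 periods.
ids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
times = [1, 2, 3, 1, 2, 3, 1, 2, 3]
x = [1.0, 2.0, 4.0, 2.0, 2.5, 5.0, 0.0, 3.0, 3.5]
alpha = {1: 1.0, 2: -2.0, 3: 0.5}  # unit effects (assumed)
gamma = {1: 0.0, 2: 3.0, 3: 7.0}   # common time shocks (assumed), trending like x
y = [2.0 * xi + alpha[i] + gamma[t] for xi, i, t in zip(x, ids, times)]

# Unit fixed effects only: the time shocks stay in the error term.
b_unit = slope(demean(x, ids), demean(y, ids))

# Unit and time fixed effects: demean by unit, then by time.
b_twoway = slope(demean(demean(x, ids), times), demean(demean(y, ids), times))

print(b_unit, b_twoway)
```

In this constructed example the unit-only estimate is pushed away from 2 because the time shocks move together with x, while the two-way estimate recovers the slope up to floating-point error.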
Interaction analyses and stratified (or separate) analyses have the same objective: predict and measure trend-line differences across levels of an interaction variable. Despite this shared objective, the two approaches rarely lead to the same estimates or inference.
Stratified/separate analyses are a much broader class of models and thus require a larger sample to obtain modestly powered inference. Interaction models, by contrast, are much more constrained and thus are more efficient when the modeling assumptions are true. If the analysis is decently powered, the stratified model is superior since it makes fewer modeling assumptions, but adequate power is rarely available in practice.
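One way to make the contrast concrete, as a sketch under assumed notation: take an outcome $y$, an exposure $x$, a binary interaction variable $g \in \{0, 1\}$, and one additional covariate $z$.

```latex
\text{Interaction model:}\quad
  y = \beta_0 + \beta_1 x + \beta_2 g + \beta_3 (x \cdot g) + \delta z + \varepsilon,
  \qquad \operatorname{Var}(\varepsilon) = \sigma^2

\text{Stratified models:}\quad
  y = \alpha_g + \theta_g x + \delta_g z + \varepsilon_g,
  \qquad \operatorname{Var}(\varepsilon_g) = \sigma_g^2,
  \qquad g = 0, 1
```

The interaction model forces a common $\delta$ and a common $\sigma^2$ across the levels of $g$, whereas the stratified fits estimate $\delta_g$ and $\sigma_g^2$ separately. Those extra free parameters are what the larger sample has to pay for.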
Best Answer
Fixed effects panel regression models involve subtracting group means from the regressors. This means that you can only include time-varying regressors in the model. Since firms usually belong to one industry, the dummy variable for industry does not vary over time. Hence Stata excludes it from your model: after subtracting the group mean from such a variable, it is equal to zero for every observation.
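Written out, with notation assumed for illustration ($x_{it}$ the time-varying regressors, $d_i$ the industry dummy of firm $i$, $a_i$ the firm effect, and bars denoting firm-level means over time), the within transformation is:

```latex
y_{it} = x_{it}'\beta + \gamma d_i + a_i + u_{it}
\;\;\Longrightarrow\;\;
y_{it} - \bar y_i = (x_{it} - \bar x_i)'\beta
                  + \gamma \underbrace{(d_i - \bar d_i)}_{=\,0}
                  + (u_{it} - \bar u_i)
```

Because $\bar d_i = d_i$ for a variable that never changes within a firm, the transformed regressor is identically zero and $\gamma$ cannot be estimated; the firm effect $a_i$ drops out for the same reason.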
Note that the Hausman test is a bit tricky, so you cannot base your model selection (fixed vs random effects) solely on it. Wooldridge explains it very nicely (in my opinion) in his book.