The reason for not including time-invariant variables is their high correlation with the fixed effects. Whether such a variable drops out of the regression depends on whether it is perfectly time-invariant or only nearly so. (Stata will drop variables that are perfectly collinear, but generally not ones that are imperfectly collinear.) However, high multicollinearity in the model generally inflates standard errors, affecting significance...so I am not sure what to make of your changed-sign variable.
As for the fixed-effects choice and modeling overall:
I do respectfully disagree with some of the previous answer. While spatial fixed effects--attached to the unit of analysis, such as states in a country, individuals in a treatment program, etc.--are not controlling for time, per se, they are still controlling for factors that do not change much over time, and thus they do need a cautious approach to including time-invariant regressors in the model. Example: say you wanted to test the effect of a country's geography, conceptualized as percentage of the nation that was mountainous, on guerrilla warfare prevalence over time (there are some articles on that very subject). An FE model would not be ideal, because, while FE would help account for the unobserved fixed effects of the countries studied (whatever makes Argentina, Canada, Uzbekistan, etc. unique that isn't included in the model), percent mountainous would most likely be unvarying, save perhaps for the occasional volcanic eruption or reconceptualization of what counts as mountainous. Multicollinearity would still be a problem. The fixed effect of each country and the percent mountainous would perfectly, or very nearly perfectly, covary.
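To make the collinearity point concrete, here is a minimal Stata sketch. It assumes the nlswork example dataset available through webuse, and the variable choices are mine for illustration, not the asker's data. Since race does not change within a person in that dataset, the fixed-effects estimator should omit it, while random effects reports a coefficient.

```stata
* Illustrative only: nlswork and these regressors are assumptions, not the asker's data
webuse nlswork, clear
xtset idcode year

* Fixed effects: i.race is collinear with the person effects and should be omitted
xtreg ln_wage tenure hours i.race, fe

* Random effects: the same time-invariant regressor gets a coefficient
xtreg ln_wage tenure hours i.race, re
```

A variable that is only nearly time-invariant would stay in the fixed-effects model, but its coefficient would rest on very little within-unit variation.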
(I believe that what the previous answer is referring to as fixed effects over time is more often accounted for by time dummies or trend variables than by the , fe option in Stata. For example, suppose you are studying the crop production of Midwestern cities. 1975 had a massive drought, but you have no drought variable. Using time dummies would help account for that, incorporating the unmeasured unique effects of certain years, such as 1975.)
If it is highly varying variables you want to examine, stick with fixed effects. If you want to know the effect of variables with much smaller/slower changes, try random effects. If you're not sure, try a Hausman test and see how the models compare.
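A rough sketch of that comparison in Stata, again assuming the nlswork example dataset and an arbitrary set of regressors chosen only for illustration:

```stata
* Illustrative only: dataset and regressors are assumptions
webuse nlswork, clear
xtset idcode year

* Fit and store the fixed-effects model
xtreg ln_wage age tenure hours, fe
estimates store fe

* Fit and store the random-effects model
xtreg ln_wage age tenure hours, re
estimates store re

* Hausman test: the consistent estimator (fe) goes first, the efficient one (re) second
hausman fe re
```

A small p-value suggests the random-effects assumptions do not hold and fixed effects is the safer choice. Note that this form of the test assumes conventional (non-robust) standard errors.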
Having an unbalanced panel is not a problem nowadays. In the past, when econometrics had to be done by hand, inverting matrices for unbalanced panels was more difficult, but for computers this is not a problem. The only remaining concern is why the panel is unbalanced: is it due to attrition? If yes, is this attrition random or related to characteristics of the statistical units? For instance, in surveys people with higher education tend to be more responsive and stay in the panel longer for that reason.
Regarding the fixed effects model, have you checked whether the variables that are time-invariant in theory are actually not varying over time? Sometimes coding errors sneak in and then all of a sudden a variable varies over time when it shouldn't. One way of checking this is to use the xtsum command, which displays overall, between, and within summary statistics. The time-invariant variables should have a zero within standard deviation. If they don't, then something went wrong in the coding.
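As a sketch of that check, assuming the nlswork example dataset (where race and birth_yr are meant to be constant within a person; the variable sd_race is a hypothetical helper name):

```stata
* Illustrative only: dataset and variable names are assumptions
webuse nlswork, clear
xtset idcode year

* Look at the "within" row: it should show a standard deviation of 0
* for variables that are supposed to be time-invariant
xtsum race birth_yr tenure

* Count observations belonging to panels where race changes over time
bysort idcode: egen sd_race = sd(race)
count if sd_race > 0 & !missing(sd_race)
```

The !missing() guard is there because sd() is missing for units observed only once. A nonzero count points to the units whose coding should be inspected.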
Having a negative Hausman test statistic is a bad sign because the matrices that the test is built on are positive semi-definite in theory, and therefore the theoretical values of the test are non-negative. Negative values point towards model misspecification or too small a sample (related to this is this question).
If you cluster your standard errors you also need a modified version of the Hausman test. This is implemented in the user-written xtoverid command (available from SSC). You can use it like this:
xtreg ln_r_prisperkg_Frst_102202 Dflere_mottak_tur i.landingsfylkekode i.kvartiler_ny markedsk_torsk gjenv_TAC_NØtorsk_år_prct lalder_fartøy i.fangstr r_minst_Frst_torsk gjenv_kvote_NØtorsk_fartøy_prct i.lengde_gruppering mobilitet, fe vce(cluster fartyid)
xtoverid
Rejecting the null rejects the validity of the assumptions underlying the random effects model.
The xtset command only takes into account the unit id for fixed effects estimation. Specifying the time variable does not eliminate time fixed effects. So running
xtset id time
xtreg y x, fe
will give you the exact same results as
xtset id
xtreg y x, fe
The time variable only needs to be specified for commands for which the sort order of the data matters; for instance, xtserial, which tests for panel autocorrelation, requires it. This has been discussed here. So if you want to include time fixed effects, you need to include the day dummies separately via i.day, for example. In this context, the season and year dummies make sense so it's good that you use them.
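A minimal sketch of adding time fixed effects explicitly, assuming the nlswork example dataset with year as the time variable (with daily data you would use i.day in the same way):

```stata
* Illustrative only: dataset and regressors are assumptions
webuse nlswork, clear
xtset idcode year

* Unit fixed effects only: year effects are NOT absorbed by xtset
xtreg ln_wage tenure hours, fe

* Unit and time fixed effects: add the time dummies yourself
xtreg ln_wage tenure hours i.year, fe

* Joint test of whether the year dummies matter
testparm i.year
```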
Best Answer
You are using the fixed effects model, also called the within model. This regression model eliminates the time-invariant fixed effects through the within transformation (i.e., it subtracts the average through time of a variable from each observation on that variable).
You are probably confusing individual and time fixed effects. Time fixed effects change through time, while individual fixed effects change across individuals.
Think of time fixed effects as a series of time-specific dummy variables. For example, the dummy variable year1992 = 1 when t=1992 and 0 when t!=1992. You see immediately that if you take the average of year1992 through time, it will be <1, so subtracting that average leaves nonzero values and this dummy won't be eliminated. So you will get an estimate for the coefficient for the effect of being in 1992.
Things are different for individual fixed effects. Also in this case, think of individual dummy variables. For example, the dummy for individual j = 1 over the whole time period you are considering for that individual. Its average through time for individual j is exactly 1 (and 0 for everyone else), so you subtract the average and sim-sala-bim...it is eliminated by the within transformation. Therefore, you won't get an estimate of the effect of being individual j.
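The within transformation can also be done by hand to see what survives it. This is a sketch under my own assumptions: the nlswork example dataset, with hypothetical helper names mean_* and dm_* for the unit means and the demeaned variables.

```stata
* Illustrative only: dataset, regressors, and helper names are assumptions
webuse nlswork, clear
keep if !missing(ln_wage, tenure, race)

* Subtract each person's time average from every observation
foreach v of varlist ln_wage tenure race {
    bysort idcode: egen double mean_`v' = mean(`v')
    generate double dm_`v' = `v' - mean_`v'
}

* A time-invariant variable is demeaned to zero; a time-varying one is not
summarize dm_race dm_tenure

* OLS on the demeaned data should give the same slope as xtreg, fe
regress dm_ln_wage dm_tenure, noconstant
xtset idcode
xtreg ln_wage tenure, fe
```

The slope on tenure should match across the last two regressions. The standard errors will differ, because plain OLS on demeaned data does not adjust the degrees of freedom for the estimated unit means.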