Solved – Fixed effects and time-invariant variables

fixed-effects-model, panel data, self-study

I understand that fixed effects remove time-invariant variables, e.g. a gender dummy, race dummies, etc. What will fixed effects do to variables that do not vary much over time?

For example:

Wage = Education + Age + Current Religion

Sample is individuals over 25.

For the majority of the sample, education (in number of years) and current religion would likely not change, but for a small proportion of the sample they might. What would happen? Would those variables simply be insignificant?

Best Answer

Fixed effects as the within estimator

An estimator with fixed effects is somewhat intuitively called the "within" estimator.

  • If you have individual fixed effects, you'll estimate your coefficients using variation within each individual, that is, your coefficients will be based upon an individual's variation over time.
  • If you have time fixed effects, you'll estimate your coefficients using variation within each time period, that is, your coefficients will be based upon cross-sectional variation at each time period.

If you add individual fixed effects, your estimate of the coefficient on current religion will be based only upon individuals who change their religion. If this almost never happens, then individual fixed effects won't work well: the coefficient is identified from a handful of switchers and will be estimated very imprecisely. There may also be some issues if the individuals who change their religion are fundamentally different from those who do not.

In the extreme, if individuals never change their religion, then you have a perfect multicollinearity problem. There is no way to distinguish individual fixed effects from cross-sectional variation based upon religion. Once you take out individual fixed effects, there's no variation in religion! (Note that you'll effectively have this kind of problem if switching religion is sufficiently rare, even if technically there is some within-individual variation.)
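To make the point about rare switchers concrete, here is a minimal sketch in Python with NumPy. The panel is made up for illustration (100 people, 5 periods, a hypothetical 0/1 religion indicator, and two people who switch in the last period); none of these numbers come from the question. It computes how much within-individual variation is left after subtracting each person's mean.

```python
import numpy as np

def within_deviation(x, groups):
    """Subtract each individual's own mean from that individual's observations."""
    sums = np.bincount(groups, weights=x)
    counts = np.bincount(groups)
    return x - (sums / counts)[groups]

# Hypothetical balanced panel: 100 people observed for 5 periods each.
n_people, n_periods = 100, 5
person = np.repeat(np.arange(n_people), n_periods)

# Case 1: nobody ever switches (religion coded 0/1, constant within person).
religion_fixed = np.repeat(np.arange(n_people) % 2, n_periods).astype(float)

# Case 2: same data, except persons 0 and 2 switch from 0 to 1 in the last period.
religion_rare = religion_fixed.copy()
for p in (0, 2):
    religion_rare[p * n_periods + n_periods - 1] = 1.0

w_fixed = within_deviation(religion_fixed, person)
w_rare = within_deviation(religion_rare, person)

# Sum of squared within-individual deviations: the variation that a
# fixed-effects regression can actually use to estimate the coefficient.
ss_fixed = float(w_fixed @ w_fixed)
ss_rare = float(w_rare @ w_rare)

# Only individuals with nonzero deviations contribute to the estimate.
switchers = np.unique(person[w_rare != 0])

print("within sum of squares, no switchers:", ss_fixed)
print("within sum of squares, two switchers:", ss_rare)
print("individuals identifying the coefficient:", switchers)
```

With no switchers the demeaned variable is identically zero, so the coefficient cannot be estimated at all. With two switchers, only those two people carry any information. In the simplest homoskedastic one-regressor case the variance of the within estimator is inversely proportional to this sum of squares, which is why a tiny amount of within variation translates into a very large standard error.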

Further note on fixed effects

Mechanically, what fixed effects are doing is demeaning your left-hand-side variable and your right-hand-side variables by the mean of each individual or time period (depending on whether you have individual or time-period fixed effects, respectively) and then running the usual regression on the demeaned data.
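The demeaning description can be checked numerically. The following is a rough sketch using simulated data; the data-generating process, the true slope of 2.0, and the helper name demean_by_group are all assumptions made for illustration. It compares the slope from regressing demeaned y on demeaned x with the slope from a regression that includes one dummy per individual.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated balanced panel (illustrative values only).
n_people, n_periods = 50, 6
person = np.repeat(np.arange(n_people), n_periods)

alpha = rng.normal(size=n_people)                            # unobserved individual effect
x = rng.normal(size=n_people * n_periods) + alpha[person]    # regressor correlated with the effect
y = 2.0 * x + alpha[person] + rng.normal(scale=0.1, size=x.size)

def demean_by_group(v, groups):
    """Subtract each group's mean from that group's observations."""
    sums = np.bincount(groups, weights=v)
    counts = np.bincount(groups)
    return v - (sums / counts)[groups]

# Within estimator: demean both sides by individual, then run plain OLS.
x_w = demean_by_group(x, person)
y_w = demean_by_group(y, person)
beta_within = (x_w @ y_w) / (x_w @ x_w)

# Dummy-variable version: regress y on x plus one indicator per individual.
dummies = (person[:, None] == np.arange(n_people)[None, :]).astype(float)
X = np.column_stack([x, dummies])
beta_dummies = np.linalg.lstsq(X, y, rcond=None)[0][0]

print("slope from demeaned regression:", beta_within)
print("slope from dummy-variable regression:", beta_dummies)
```

The two slopes agree up to floating-point error, which is the sense in which "adding individual fixed effects" and "demeaning by individual" are the same operation. Any regressor that is constant within an individual would be turned into a column of zeros by demean_by_group, which is why time-invariant variables drop out.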

Quick comment on your regression

My big concern is that cross-sectionally, the religion variable is going to pick up certain sociodemographic variation. E.g., in the U.S., Indians (i.e., people from India) disproportionately tend to be engineers, doctors, etc., so Hindu may forecast engineer in Silicon Valley. You'll be picking up skills that aren't measured by the imprecise "education" variable.

Once you add fixed effects, it's really unclear what you're going to be picking up. Who converts to Hinduism in the U.S.? Will you disproportionately pick up yoga enthusiasts? Individuals who marry Indians?

I don't know... it's just hard for me to see how religion matters except that it will be related to underlying but poorly observed skills. If you're just trying to forecast wages, this may be fine, but it's going to be extremely difficult/impossible to talk about causal effects. I wouldn't believe it.
