Incidental Parameter – Incidental Parameter Problem in Nonlinear Regression

bias, fixed-effects-model, nonlinear regression

I always struggle to grasp the true essence of the incidental parameter problem. I have read on several occasions that the fixed effects estimators of nonlinear panel data models can be severely biased because of the "well-known" incidental parameter problem.

When I ask for a clear explanation of this problem, the typical answer is: assume that the panel data has N individuals observed over T time periods. If T is fixed, the covariate estimates remain biased even as N grows. This occurs because the number of nuisance parameters grows with N.

I would greatly appreciate

  • a more precise but still simple explanation (if possible)
  • and/or a concrete example that I can work out with R or Stata.

Best Answer

In FE models of the type $$y_{it} = \alpha_i + \beta X_{it} + u_{it}$$ $\alpha_i$ is the incidental parameter: it is of secondary interest, and there is one such parameter per individual, so their number grows with N. Usually, $\beta$ is the parameter of interest. Still, $\alpha_i$ matters because it carries information on the individual intercept.

Most panels are short, i.e., T is relatively small. To illustrate the incidental parameter problem I will drop $\beta$ for simplicity, so the model becomes: $$y_{it} = \alpha_i + u_{it} \quad \quad u_{it}\overset{iid}{\sim} N(0,\sigma^2)$$ Using the deviations-from-means (within) method, the residuals are $\hat{u}_{it} = y_{it}-\bar{y}_i$, where $\bar{y}_i$ is the estimate of $\alpha_i$. Let's look at the maximum likelihood estimate of $\sigma^2$ as N grows with T fixed: $$\hat{\sigma}^2 = \frac{1}{NT}\sum_i\sum_t (y_{it}-\bar{y}_i)^2 \overset{d}{=} \sigma^2\frac{\chi_{N(T-1)}^2}{NT} \overset p{\to} \sigma^2\frac{N(T-1)}{NT} = \sigma^2\frac{T-1}{T}$$

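You can see that if T is large, the factor $\frac{T-1}{T}$ is close to 1 and the bias is negligible. But if T is small and fixed (which is the case in most panels), $\hat{\sigma}^2$ is inconsistent no matter how large N becomes: each $\alpha_i$ is estimated from only T observations, so its estimation error never averages out. With T = 2, for example, $\hat{\sigma}^2$ converges to $\sigma^2/2$. This is the incidental parameter problem; in nonlinear fixed effects models the same mechanism generally makes the estimator of $\beta$ itself inconsistent.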
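The limit $\sigma^2\frac{T-1}{T}$ can be checked numerically. The following is a minimal simulation sketch in Python with NumPy (rather than R or Stata); the function name `sigma2_mle`, the choice $\sigma^2 = 1$, the standard normal draws for $\alpha_i$, and the sample sizes are all assumptions made for illustration.

```python
import numpy as np

def sigma2_mle(N, T, sigma2=1.0, seed=0):
    """ML estimate of sigma^2 in y_it = alpha_i + u_it, u_it ~ N(0, sigma2)."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(N, 1))                  # individual effects (assumed N(0, 1))
    y = alpha + rng.normal(scale=np.sqrt(sigma2), size=(N, T))
    resid = y - y.mean(axis=1, keepdims=True)        # y_it - ybar_i
    return (resid ** 2).sum() / (N * T)              # divides by NT, not N(T-1)

# Large N, varying T: the estimate tracks sigma2 * (T-1)/T, not sigma2.
for T in (2, 5, 50):
    est = sigma2_mle(N=20000, T=T)
    print(f"T={T:3d}  sigma2_hat={est:.3f}  (T-1)/T={(T - 1) / T:.3f}")
```

Increasing N in this sketch only tightens the estimate around $\sigma^2\frac{T-1}{T}$; it does not move it toward $\sigma^2$. Dividing by $N(T-1)$ instead of $NT$ removes the bias in this simple linear case.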

In the linear model, $\hat{\beta}$ is nevertheless consistent: the within transformation removes $\alpha_i$ entirely, and N is usually large enough for the asymptotics in N to apply.
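For the linear case, the consistency of $\hat{\beta}$ with short T can also be illustrated by simulation. This is a sketch under assumed settings: the function name `within_beta`, the true value $\beta = 1.5$, and the way $X_{it}$ is made to correlate with $\alpha_i$ are hypothetical choices, not part of the original argument.

```python
import numpy as np

def within_beta(N, T, beta=1.5, seed=0):
    """Within (fixed effects) estimate of beta in y_it = alpha_i + beta*x_it + u_it."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(N, 1))
    x = alpha + rng.normal(size=(N, T))              # regressor correlated with alpha_i
    y = alpha + beta * x + rng.normal(size=(N, T))
    xd = x - x.mean(axis=1, keepdims=True)           # demeaning removes alpha_i
    yd = y - y.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()         # OLS slope on demeaned data

# T stays at 2 while N grows: the estimate settles near the true beta.
for N in (200, 2000, 20000):
    print(f"N={N:6d}  beta_hat={within_beta(N, T=2):.3f}")
```

Here T = 2 throughout, so the variance estimate from the same data would still be off by the factor $\frac{T-1}{T}$, while the slope estimate is not affected.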

Note that in spatial panels, for example, the situation is reversed: T is usually considered large enough, but N is fixed, so the asymptotics come from T. Spatial panels therefore need a large T.

Hope it helps somehow.