Does serial correlation have something to do with endogeneity?

Tags: autocorrelation, endogeneity, regression

I'm a beginner in econometrics. My understanding is that endogeneity is caused by omitted variables, measurement error, and reverse causality, and that it makes the OLS estimator biased.

I've also learned that serial correlation, which refers to correlation among the error terms, means the error variance-covariance matrix is no longer a scalar multiple of the identity matrix, which can make the OLS estimator inefficient.

However, the well-known free YouTube channel 'Ben Lambert' offers lectures, and I've just seen in one of them that serial correlation is sometimes caused by omitted variables, measurement error, etc. If that's the case, then it seems an omitted variable leads not just to a biased estimator but also to an inefficient one at the same time.

How should I understand this lecture? Could anybody please explain this?

Best Answer

I am answering under the supervision of CV's peers. Be critical.

Assume one has the following model specification

$\boldsymbol{y} = \boldsymbol{X}\boldsymbol{\beta} + \boldsymbol{u}$

where $\boldsymbol{y}$ is an $n \times 1$ vector, $\boldsymbol{X}$ an $n \times k$ matrix, $\boldsymbol{\beta}$ a $k \times 1$ parameter vector and $\boldsymbol{u}$ an $n \times 1$ vector of homoscedastic but autocorrelated errors. At this stage we do not yet know how they are autocorrelated.

Assume that one omitted another variable, say an $n \times 1$ vector $\boldsymbol{z}$, whose endogeneity consists of its autocorrelation, such that

$\boldsymbol{z} = f(\boldsymbol{z})$

where $f$ is assumed to be a bijective (invertible) vector function that specifies the correlation structure among the $n$ components $z_i$, $i = 1, \dots, n$, of $\boldsymbol{z}$.

This means that $\boldsymbol{u}$ is implicitly generated as follows, where $\gamma \neq 0$ is a scalar parameter and $\boldsymbol{v}$ is an $n \times 1$ vector of errors assumed to be iid normal:

$\boldsymbol{u} = \gamma\boldsymbol{z} + \boldsymbol{v} \iff \frac{1}{\gamma}(\boldsymbol{u}-\boldsymbol{v}) = \boldsymbol{z} \iff f(\frac{1}{\gamma}(\boldsymbol{u}-\boldsymbol{v})) = f(\boldsymbol{z})$

But since $\boldsymbol{z} = f(\boldsymbol{z})$, one can substitute $\boldsymbol{z} = \frac{1}{\gamma}(\boldsymbol{u}-\boldsymbol{v})$ on both sides of that identity, which leads to

$\frac{\boldsymbol{u}-\boldsymbol{v}}{\gamma} = f\left(\frac{\boldsymbol{u}-\boldsymbol{v}}{\gamma}\right) \iff \boldsymbol{u} = \gamma f\left(\frac{\boldsymbol{u}-\boldsymbol{v}}{\gamma}\right) + \boldsymbol{v}$

This shows that the components of $\boldsymbol{u}$ are correlated as well, even if their correlation structure is not the same as the one among the components of $\boldsymbol{z}$.
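
To make this concrete, here is a minimal worked case. It assumes, purely for illustration, that $\boldsymbol{z}$ follows a stationary AR(1) process and that $\boldsymbol{v}$ is independent of $\boldsymbol{z}$ with variance $\sigma_v^2$. The symbols $\rho$, $\varepsilon_t$, $\sigma_\varepsilon^2$, $\sigma_v^2$ and $\sigma_z^2$ are introduced only for this sketch and do not come from the model above.

$z_t = \rho z_{t-1} + \varepsilon_t, \quad |\rho| < 1, \quad \sigma_z^2 = \operatorname{Var}(z_t) = \frac{\sigma_\varepsilon^2}{1-\rho^2}$

$\operatorname{Cov}(u_t, u_{t-h}) = \gamma^2 \operatorname{Cov}(z_t, z_{t-h}) = \gamma^2 \rho^h \sigma_z^2, \quad h \geq 1$

$\operatorname{Corr}(u_t, u_{t-h}) = \rho^h \, \frac{\gamma^2 \sigma_z^2}{\gamma^2 \sigma_z^2 + \sigma_v^2}$

Under these assumptions the errors $u_t$ are serially correlated whenever $\gamma \neq 0$ and $\rho \neq 0$, but more weakly than $z_t$ itself, since the fraction is strictly less than one when $\sigma_v^2 > 0$.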

Thus yes, serial correlation does have something to do with endogeneity when, e.g., this endogeneity consists of an omitted autocorrelated variable whose autocorrelation structure is invertible.


But actually, it is very unlikely that $f$ is invertible. That is, if the autocorrelation works through time, $f$ is the backshift operator, and it is not invertible.
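
For the time-series case, the link between an omitted autocorrelated variable and serially correlated errors can also be checked numerically. The following is a minimal simulation sketch, not a proof. It assumes an AR(1) process for $\boldsymbol{z}$, and all names and numeric values in it (`ar1`, `fit_omitting_z`, `rho=0.8`, `beta=2.0`, `gamma=1.5`, the sample size) are hypothetical choices made for illustration. It compares two cases: the regressor is independent of the omitted $\boldsymbol{z}$, and the regressor is correlated with it.

```python
import numpy as np


def ar1(n, rho, rng):
    """Simulate a zero-mean stationary AR(1) series z_t = rho * z_{t-1} + eps_t."""
    eps = rng.standard_normal(n)
    z = np.empty(n)
    z[0] = eps[0] / np.sqrt(1.0 - rho**2)  # start from the stationary distribution
    for t in range(1, n):
        z[t] = rho * z[t - 1] + eps[t]
    return z


def lag1_autocorr(e):
    """Sample lag-1 autocorrelation of a series."""
    e = e - e.mean()
    return float(e[1:] @ e[:-1] / (e @ e))


def fit_omitting_z(x, z, rng, beta=2.0, gamma=1.5):
    """Generate y = 1 + beta*x + gamma*z + v, then regress y on (1, x) only.

    Returns the estimated slope on x and the lag-1 autocorrelation
    of the OLS residuals. The values of beta and gamma are assumptions.
    """
    n = len(x)
    v = rng.standard_normal(n)               # iid normal noise
    y = 1.0 + beta * x + gamma * z + v       # true model includes z
    X = np.column_stack([np.ones(n), x])     # misspecified design: z is left out
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return float(coef[1]), lag1_autocorr(resid)


rng = np.random.default_rng(0)
n = 20_000
z = ar1(n, rho=0.8, rng=rng)

# Case A: x is independent of the omitted z.
x_exog = rng.standard_normal(n)
slope_a, rho_a = fit_omitting_z(x_exog, z, rng)

# Case B: x is correlated with the omitted z.
x_endog = 0.5 * z + rng.standard_normal(n)
slope_b, rho_b = fit_omitting_z(x_endog, z, rng)

print(f"Case A: slope = {slope_a:.3f}, residual lag-1 autocorr = {rho_a:.3f}")
print(f"Case B: slope = {slope_b:.3f}, residual lag-1 autocorr = {rho_b:.3f}")
```

In this setup the residuals should be clearly autocorrelated in both cases, while the slope should land near the true value of 2 only in case A. That matches the distinction in the question: an omitted autocorrelated variable always feeds serial correlation into the error term, but it biases OLS only when it is also correlated with the included regressors.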
