Econometrics – Understanding the Heckman Selection Model and Negative Rho

econometrics, heckman, self-study

Following up on the post "Bivariate probit model with sample selection", what would be an example of negative correlation between the errors of the selection and outcome equations?

I have also found an example where rho is negative: http://personal.rhul.ac.uk/uhte/006/ec5040/selectivity.pdf. In that example, unobserved factors that make participation more likely (positive errors in the selection equation, e.g. ability) tend to be associated with lower wages (negative errors in the outcome equation). What could the unobservables in the outcome equation represent?

Best Answer

Here's a toy example with negative correlation in the errors in the two equations.

Let the outcome $y=\alpha + \beta \cdot x + \varepsilon$ represent immigrant earnings, where $x$ is years of schooling. The participation equation $p=\gamma + \delta \cdot x + u$ describes the decision to emigrate. We only observe $y$ if $p \ge 0$. There's an unobserved non-cognitive trait called restlessness that makes emigration more likely, which we will represent by $u$. But restlessness can also lower wages, so it is also part of the error in the earnings equation, where it enters negatively. As a result, $u$ and $\varepsilon$ are negatively correlated.
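To make the role of the correlation explicit, here is a sketch of the expected earnings among those who emigrate. It rests on assumptions that are not part of the toy example itself: $(u, \varepsilon)$ are jointly normal with mean zero, $\operatorname{Var}(u)=1$, $\operatorname{sd}(\varepsilon)=\sigma_\varepsilon$, and correlation $\rho$. Here $\phi$ and $\Phi$ are the standard normal density and distribution function, and $\lambda$ is the inverse Mills ratio.

$$
E[\,y \mid x,\; p \ge 0\,] = \alpha + \beta x + E[\,\varepsilon \mid u \ge -(\gamma + \delta x)\,] = \alpha + \beta x + \rho\,\sigma_\varepsilon\,\lambda(\gamma + \delta x),
\qquad
\lambda(z) = \frac{\phi(z)}{\Phi(z)} .
$$

Since $\lambda(z)$ is positive and decreasing in $z$, a negative $\rho$ makes the extra term negative, and the term is largest in magnitude for people whose index $\gamma + \delta x$ is low, that is, for those who were least likely to emigrate on observables.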

Let's suppose that $\delta>0$, so education makes people more likely to leave their home. Folks with low $x$ will only emigrate if they have a high $u$, but people with high $x$ will emigrate with almost any value of $u$. Because a high $u$ goes together with a low $\varepsilon$, the low-education immigrants in our earnings data will be disproportionately restless and will earn less than their schooling alone would predict, while the high-education immigrants will be close to average in restlessness. This means that the estimate of the effect of schooling on immigrant income will be too steep.
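The argument can be checked with a small simulation. This is only a sketch under assumptions that are not in the example above: the parameter values, the standardized schooling variable, the joint normality of the errors, and the function name `simulate_selection_bias` are all hypothetical choices made for illustration. It compares the OLS slope on the full population with the OLS slope on emigrants only.

```python
import numpy as np


def simulate_selection_bias(rho=-0.7, n=200_000, seed=0):
    """Return (true beta, OLS slope on everyone, OLS slope on emigrants only)."""
    rng = np.random.default_rng(seed)

    # Hypothetical parameter values, chosen only for illustration.
    alpha, beta = 1.0, 0.5      # earnings equation
    gamma, delta = -1.0, 1.0    # emigration equation, delta > 0

    x = rng.normal(size=n)      # schooling, standardized (an assumption)

    # Build errors with corr(u, eps) = rho and unit variances.
    u = rng.normal(size=n)                               # "restlessness"
    v = rng.normal(size=n)
    eps = rho * u + np.sqrt(1.0 - rho**2) * v

    y = alpha + beta * x + eps  # earnings (latent for non-emigrants)
    p = gamma + delta * x + u   # propensity to emigrate
    emigrated = p >= 0          # y is observed only for these people

    # np.polyfit with degree 1 returns [slope, intercept].
    slope_all = np.polyfit(x, y, 1)[0]
    slope_emigrants = np.polyfit(x[emigrated], y[emigrated], 1)[0]
    return beta, slope_all, slope_emigrants


beta, slope_all, slope_emigrants = simulate_selection_bias()
print(f"true beta:              {beta:.3f}")
print(f"OLS slope, everyone:    {slope_all:.3f}")
print(f"OLS slope, emigrants:   {slope_emigrants:.3f}")
```

With a negative `rho`, the emigrant-only slope should come out above the true `beta`, while setting `rho=0.0` should remove the gap up to sampling noise. A Heckman-type correction would aim to recover `beta` from the selected sample, but that step is not shown here.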