I was told that it's possible to run a two-stage IV regression where the first stage is a probit and the second stage is an OLS. Is it possible to use 2SLS if the first stage is a probit but the second stage is a probit/poisson model?
Probit Two-Stage Least Squares (2SLS) – Overview and Application
2sls, binary data, instrumental-variables, probit
Related Solutions
There has been a similar question regarding a probit first stage and an OLS second stage. In the answer I have provided a link to notes that contain a formal proof of the inconsistency of this regression, which is known as the "forbidden regression", a term coined by Jerry Hausman. The main reason for the inconsistency of the probit first stage/OLS second stage approach is that neither the expectations operator nor the linear projection operator passes through a non-linear first stage. Therefore the fitted values from a first stage probit are only uncorrelated with the second stage error term under very restrictive assumptions that almost never hold in practice. Be aware though that the formal proof of the inconsistency of the forbidden regression is quite elaborate, if I remember correctly.
If you have a model $$Y_i = \alpha + \beta X_i + \epsilon_i$$ where $Y_i$ is a continuous outcome and $X_i$ is a binary endogenous variable, you can run the first stage $$X_i = a + Z'_i\pi + \eta_i$$ via OLS and use the fitted values $\widehat{X}_i$ instead of $X_i$ in the second stage. This is the linear probability model you were referring to. Given that there is no problem for expectations or linear projections with this linear first stage, your 2SLS estimates will be consistent, albeit less efficient than they could be if we were to take into account the non-linear nature of $X_i$.
Consistency of this approach stems from the fact that, whilst a non-linear model may fit the conditional expectation function more closely for limited dependent variables, this does not matter much if you are interested in the marginal effect. In the linear probability model the coefficients themselves are marginal effects evaluated at the mean, so if the marginal effect at the mean is what you are after (and usually people are), then this is what you want, given that the linear model gives the best linear approximation to a non-linear conditional expectation function.
The same holds true if $Y_i$ is binary, too.
For a more detailed discussion of this have a look at Kit Baum's excellent lecture notes on this topic. From slide 7 he discusses the use of the linear probability model in the 2SLS context.
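As a rough illustration of the linear first-stage route, here is a minimal Stata sketch using the nlswork toy data. The variable roles (ln_wage as the outcome, union as the binary endogenous regressor, tenure as the instrument) are assumptions chosen only to make the syntax concrete, not a sensible identification strategy.

```stata
// toy panel data shipped with Stata
webuse nlswork, clear
// 2SLS with a linear (OLS) first stage for the binary endogenous regressor union;
// tenure is the excluded instrument, age and race are exogenous controls
ivregress 2sls ln_wage age i.race (union = tenure), first vce(cluster idcode)
```

The first option prints the linear first stage, and clustering on idcode is one way to account for repeated observations on the same person in this panel.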
Finally, if you really want to use probit because you want more efficient estimates, then there is another way, which is also mentioned in Wooldridge (2010) "Econometric Analysis of Cross Section and Panel Data". The above linked answer includes it; I repeat it here for completeness. As an applied example see Adams et al. (2009), who use a three-step procedure that goes as follows:
- use probit to regress the endogenous variable on the instrument(s) and exogenous variables
- use the predicted values from the previous step in an OLS first stage together with the exogenous (but without the instrumental) variables
- do the second stage as usual
This procedure does not fall for the forbidden regression problem but potentially delivers more efficient estimates of your parameter of interest.
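The three steps above could be sketched in Stata roughly as follows. This is a sketch under assumptions: the nlswork toy data and the variable roles (ln_wage as outcome, union as binary endogenous variable, tenure as instrument) are picked for illustration, and union_phat is a hypothetical variable name.

```stata
webuse nlswork, clear
// Step 1: probit of the binary endogenous variable on the instrument and the exogenous controls
probit union tenure age i.race
predict union_phat, pr
// Steps 2 and 3: 2SLS where the fitted probability is the only excluded instrument;
// the original instrument tenure does not enter again
ivregress 2sls ln_wage age i.race (union = union_phat), vce(cluster idcode)
```

Because the probit prediction is used as an instrument in a linear first stage, rather than being plugged directly into the outcome equation, the second stage still relies only on linear projections.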
Your case is less problematic than the other way round. The expectations and linear projection operators go through a linear first stage (e.g. OLS) but not through non-linear ones like probit or logit. Therefore it's not a problem if you first regress your continuous endogenous variable $X$ on your instrument(s) $Z$, $$X_i = a + Z'_i\pi + \eta_i$$ and then use the fitted values in a probit second stage to estimate $$\text{Pr}(Y_i=1|\widehat{X}_i) = \text{Pr}(\beta\widehat{X}_i + \epsilon_i > 0)$$
The standard errors won't be right because $\widehat{X}_i$ is an estimated quantity rather than observed data, and the second stage output ignores the sampling error from the first stage. You can correct this by bootstrapping both first and second stage together. In Stata this would be something like:
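// use a toy data set as example
webuse nlswork, clear
// set up the program including 1st and 2nd stage
capture program drop my2sls
program my2sls
    // 1st stage: linear regression of the endogenous variable on instrument and controls
    reg grade age race tenure
    predict grade_hat, xb
    // 2nd stage: probit on the fitted values and the same controls
    probit union grade_hat age race
    // drop the fitted values so the next bootstrap replication can recreate them
    drop grade_hat
end
// obtain bootstrapped standard errors
bootstrap, reps(100): my2sls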
In this example we want to estimate the effect of years of education on the probability of being in a labor union. Given that years of education are likely to be endogenous, we instrument it with years of tenure in the first stage. Of course, this doesn't make any sense from the point of view of interpretation, but it illustrates the code.
Just make sure that you use the same exogenous control variables in both first and second stage. In the above example those are age and race, whereas the (non-sensical) instrument tenure is only there in the first stage.
Best Answer
What was proposed to you is sometimes referred to as a forbidden regression, and in general you will not consistently estimate the relationship of interest. Forbidden regressions produce consistent estimates only under very restrictive assumptions which rarely hold in practice (see for instance Wooldridge (2010) "Econometric Analysis of Cross Section and Panel Data", p. 265-268).
The problem is that neither the conditional expectations operator nor the linear projection carries through nonlinear functions. For this reason only an OLS regression in the first stage is guaranteed to produce fitted values that are uncorrelated with the residuals. A proof for this can be found in Greene (2008) "Econometric Analysis" or, if you want a more detailed (but also more technical) proof, you can have a look at the notes by Jean-Louis Arcand on p. 47 to 52.
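To see where the argument bites, here is a short sketch rather than a proof. It assumes a model $Y_i = \alpha + \beta X_i + \epsilon_i$ with a binary endogenous $X_i$ and instruments $Z_i$; the symbols $\gamma$, $\widehat{\gamma}$ and $\Phi$ (the standard normal CDF) are introduced here for illustration only. Plugging the probit fitted values $\widehat{X}_i = \Phi(Z'_i\widehat{\gamma})$ into the second stage amounts to estimating

$$Y_i = \alpha + \beta \widehat{X}_i + \left[\epsilon_i + \beta\,(X_i - \widehat{X}_i)\right]$$

so consistency requires the composite error in brackets to be uncorrelated with $\widehat{X}_i$. With an OLS first stage, the residual $X_i - \widehat{X}_i$ is orthogonal to $\widehat{X}_i$ by construction, whether or not the linear first stage is the true conditional expectation. With a probit first stage there is no such mechanical orthogonality: it holds asymptotically only if the first stage is exactly right, i.e. $$E[X_i \mid Z_i] = \Phi(Z'_i\gamma)$$ which is the kind of restrictive assumption referred to above.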
For the same reason as in the forbidden regression, this seemingly obvious two-step procedure of mimicking 2SLS with probit will not produce consistent estimates. This is again because expectations and linear projections do not carry over through nonlinear functions. Wooldridge (2010) in section 15.7.3 on page 594 provides a detailed explanation for this. He also explains the proper procedure of estimating probit models with a binary endogenous variable. The correct approach is to use maximum likelihood, but doing this by hand is not exactly trivial. Therefore it is preferable to use statistical software that has a ready-made routine for this. For example, the Stata command would be ivprobit (see the Stata manual for this command, which also explains the maximum likelihood approach). Note that ivprobit is documented for continuous endogenous regressors; with a binary endogenous regressor, the maximum likelihood estimator is a bivariate probit, which Stata provides as biprobit.
If you require references for the theory behind probit with instrumental variables see for instance:
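Finally, combining different estimation methods in the first and second stages is difficult unless there exists a theoretical foundation which justifies their use. This is not to say that it is not feasible though. For instance, Adams et al. (2009) use a three-step procedure where they have a probit "first stage" and an OLS second stage without falling for the forbidden regression problem. Their general approach is:
- use probit to regress the endogenous variable on the instrument(s) and exogenous variables
- use the predicted values from the previous step in an OLS first stage together with the exogenous (but without the instrumental) variables
- do the second stage as usual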
A similar procedure was employed by a user on the Statalist who wanted to use a Tobit first-stage and a Poisson second stage (see here). The same fix should be feasible for your estimation problem.
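For the maximum likelihood route via ivprobit mentioned earlier in this answer, a minimal Stata sketch might look as follows. The nlswork toy data and the variable roles are assumptions for illustration only (tenure is not a credible instrument), and convergence on this particular data set is not guaranteed. The second command is a recursive bivariate probit via biprobit, shown as one possible option when the endogenous regressor is itself binary.

```stata
webuse nlswork, clear
// continuous endogenous regressor: grade, instrumented by tenure (illustrative only)
ivprobit union age i.race (grade = tenure), vce(cluster idcode)
// binary endogenous regressor: recursive bivariate probit, with collgrad
// entering the union equation and tenure excluded from it
biprobit (union = collgrad age i.race) (collgrad = tenure age i.race), vce(cluster idcode)
```

In both cases the two equations are estimated jointly, so the reported standard errors do not need the bootstrap correction that a hand-rolled two-step procedure requires.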