I've been running a 2SLS regression manually and have obtained results that I find a bit suspicious.
The F-test from the first-stage regression gives a statistic of 31.25 (F(IVs, n-k) = 31.25). I then ran the rank test on the first-stage regression with the following code in Stata:
ranktest (endog_var)(Z1 Z2 exogen_var), full robust
Here "endog_var" is the endogenous variable, "Z1" and "Z2" are the instruments and "exogen_var" is a set of exogenous variables. This gives a p-value of 0.17, i.e. the null hypothesis that the matrix is not of full rank is not rejected. Is that possible? Am I doing something wrong? Should I partial out the exogenous variables?
Best Answer
Yes, you need to partial out the exogenous variables using the partial() option in ranktest, rather than listing them alongside the instruments. So the correct syntax should be:
ranktest (endog_var) (Z1 Z2), partial(exogen_var) full robust
This is also done in the documentation for ranktest (at the bottom of the helpfile). You can check this by comparing your results to the Kleibergen-Paap rk statistic reported by ivreg2 with robust standard errors. As an example:
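A comparison of this kind can be sketched as follows. This is only an illustration under stated assumptions: the auto dataset shipped with Stata and the roles given to its variables (price as outcome, mpg as endogenous regressor, headroom and trunk as instruments, weight as exogenous control) are chosen for convenience and are not the data from the question. The labels "Test 1" and "Test 2" are likewise an assumption about which call is which. Both ranktest and ivreg2 are user-written commands available from SSC.

```stata
* Illustration only: dataset and variable roles are assumptions, not the
* data from the question.
* ssc install ranktest
* ssc install ivreg2
sysuse auto, clear

* Reference: ivreg2 with robust SEs reports the Kleibergen-Paap rk statistic
ivreg2 price (mpg = headroom trunk) weight, robust

* Test 1: exogenous regressor partialled out
ranktest (mpg) (headroom trunk), partial(weight) full robust

* Test 2: exogenous regressor listed together with the instruments
ranktest (mpg) (headroom trunk weight), full robust
```

The point of running all three is to see which ranktest call reproduces the underidentification statistic and p-value shown in the ivreg2 output.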
You see that test 1 produced the correct rk statistic and p-value (as in the ivreg2 output) whilst test 2 did not come up with the correct results.
And yes, the matrix may not be of full rank if your instruments are not strong enough. ivreg2 also provides the F-test for the excluded instruments (see the Angrist and Pischke F-statistic). For more information see section 7 of Baum et al (2007) "Enhanced routines for instrumental variables / generalized method of moments estimation and testing". Using ivreg2 is generally a better strategy than doing 2SLS "by hand" because the Stata routine provides you with a whole range of test statistics that are useful and it also provides you with the correct standard errors.
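For the model in the question, a single ivreg2 call along the following lines would replace the manual two-step procedure. This is a sketch under assumptions: "y" is a hypothetical name for the outcome variable, which the question does not give, and the other names are the placeholders used in the question.

```stata
* Sketch only: y is a hypothetical outcome name; the other names are the
* placeholders from the question.
* robust requests heteroskedasticity-robust standard errors;
* first additionally displays the first-stage regression and its diagnostics.
ivreg2 y (endog_var = Z1 Z2) exogen_var, robust first
```

Here the exogenous variables appear outside the parentheses, so ivreg2 treats them as included instruments and accounts for them when computing the first-stage and rank statistics.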