Solved – F-test and Rank test for underidentification

I've been running a 2SLS regression manually and have obtained results that look a bit suspicious.

I ran the F-test on the first-stage regression and obtained a statistic of 31.25 (F(IVs, n-k) = 31.25). I then ran the rank test on the first stage with the following code in Stata:

ranktest (endog_var)(Z1 Z2 exogen_var), full robust

Here "endog_var" is the endogenous variable, "Z1" and "Z2" are the instruments, and "exogen_var" is a set of exogenous variables. I obtained a p-value of 0.17, i.e. the null hypothesis that the matrix is not of full rank is not rejected. Is that possible? Am I doing something wrong? Should I partial out the exogenous variables?

Best Answer

Yes, you need to partial out the exogenous variables using the partial() option of ranktest. The correct syntax is:

ranktest (endog_var)(Z1 Z2), partial(exogen_var) full robust

The documentation for ranktest does the same (see the bottom of the help file). You can check this by comparing your results to the Kleibergen-Paap rk LM statistic reported by ivreg2 with robust standard errors.

As an example:

// use a toy data set (clear drops any data already in memory)
sysuse auto, clear

// run the IV regression with two instruments using ivreg2
ivreg2 price weight (mpg = foreign trunk), first robust

/* this is the output from the first stage diagnostics for underidentification
Underidentification test
Ho: matrix of reduced form coefficients has rank=K1-1 (underidentified)
Ha: matrix has rank=K1 (identified)
Kleibergen-Paap rk LM statistic          Chi-sq(2)=1.90     P-val=0.3863
*/

// Test 1
// compare the Kleibergen-Paap rk LM test from ivreg2 with the manual test (partial out exogenous variables)
ranktest (mpg) (foreign trunk), partial(weight) full robust

/* output
Kleibergen-Paap rk LM test of rank of matrix
  Test statistic robust to heteroskedasticity
Test of rank=  0  rk=    1.90  Chi-sq(  2) pvalue=0.386287
*/

// Test 2
// compare the Kleibergen-Paap rk LM test from ivreg2 with the manual test (not partialling out exogenous variables)
ranktest (mpg) (foreign trunk weight), full robust

/* output
Kleibergen-Paap rk LM test of rank of matrix
  Test statistic robust to heteroskedasticity
Test of rank=  0  rk=   30.70  Chi-sq(  3) pvalue=0.000001
*/

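Test 1 reproduces the rk statistic and p-value from the ivreg2 output, whilst test 2 does not. In test 2 the exogenous regressor weight is treated as if it were a third instrument (note the 3 degrees of freedom instead of 2), so the test picks up the correlation between mpg and weight rather than the relevance of the excluded instruments alone.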
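To make clear what partialling out does, here is a minimal sketch that does it by hand on the same toy data: each variable is regressed on the exogenous regressor (plus a constant) and the residuals are passed to ranktest. The variable names e_mpg, e_foreign and e_trunk are made up for this illustration. I would expect the rk LM statistic to agree with Test 1, but treat that as something to verify rather than a guarantee.

```stata
// partialling out by hand (sketch); e_* names are hypothetical
sysuse auto, clear

// residualize the endogenous variable on the exogenous regressor
regress mpg weight
predict double e_mpg, residuals

// residualize each excluded instrument on the exogenous regressor
regress foreign weight
predict double e_foreign, residuals

regress trunk weight
predict double e_trunk, residuals

// rank test on the residualized variables; compare with Test 1
ranktest (e_mpg) (e_foreign e_trunk), full robust
```

In practice the partial() option is the more convenient route, since it avoids creating extra variables.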

And yes, the matrix may fail to be of full rank if your instruments are not strong enough. ivreg2 also reports the F-test for the excluded instruments (see the Angrist and Pischke F-statistic). For more information, see section 7 of Baum et al. (2007), "Enhanced routines for instrumental variables / generalized method of moments estimation and testing". Using ivreg2 is generally a better strategy than doing 2SLS "by hand", because it reports a whole range of useful test statistics and also gives you the correct standard errors.
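If you still want to see where the first-stage F-test comes from, a rough sketch on the toy data is to run the first stage yourself and jointly test the excluded instruments. With a single endogenous regressor this should be comparable to the excluded-instruments F that ivreg2 prints with the first option, although small differences from degrees-of-freedom adjustments are possible.

```stata
// first-stage F-test for the excluded instruments by hand (sketch)
sysuse auto, clear

// first stage: endogenous variable on instruments and exogenous regressor
regress mpg foreign trunk weight, vce(robust)

// joint test that the coefficients on the excluded instruments are zero
test foreign trunk
```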