(1) If I believe my instrument is exogenous conditional upon a few exogenous variables, do I include them only in the first stage? I.e. would the command be:
ivregress 2sls Y (X = inst1 inst2 exog1 exog2) exog3 exog4
Where `Y` is the dependent variable, `X` is an endogenous variable, `inst1` and `inst2` are the instruments for `X`, `exog1` and `exog2` are what make the instruments exogenous to the error in the second stage, and `exog3` and `exog4` are just second-stage variables that enhance precision? Would I need to include `exog1` and `exog2` outside the parentheses also?
(2) If I want to interact the endogenous variable `X` with `exog3`, how should I instrument `X` and interact it? As follows?
ivregress 2sls Y (X*exog3 = inst1 inst2 exog1 exog2) exog4
Best Answer
You need to include all your exogenous variables in both the first and the second stage, as otherwise you might end up with biased estimates. For a discussion of why having some exogenous variables in the first but not in the second stage is problematic, see here. Given your setup, the correct syntax for Stata would be
ivregress 2sls Y exog1 exog2 exog3 exog4 (X = inst1 inst2)
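To check that the exogenous covariates really do enter both stages, one option is to have `ivregress` print the first stage. This is only a minimal sketch: it assumes the variables named in the question (`Y`, `X`, `inst1`, `inst2`, `exog1` to `exog4`) are already in memory.

```stata
* Assumption: Y, X, inst1, inst2 and exog1-exog4 already exist in the dataset.
* The "first" option reports the first-stage regression next to the 2SLS results.
* Its regressor list should contain exog1-exog4 as well as inst1 and inst2.
ivregress 2sls Y exog1 exog2 exog3 exog4 (X = inst1 inst2), first

* First-stage diagnostics for the excluded instruments
estat firststage
```

The first-stage table should list all four exogenous covariates together with the two instruments, which is the "both stages" requirement described above.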
As a side note: instead of `ivregress` you might want to use `ivreg2`, which is a user-written command that provides many more diagnostic statistics for your 2SLS model.

For the interaction of the endogenous variable and `exog3` you would also need to generate an interaction between the instruments and `exog3`. In a model like

$$Y_i = \alpha + \beta_1 \text{exog1}_i + \beta_2 \text{exog2}_i + \beta_3 \text{exog3}_i + \beta_4 \text{exog4}_i + \gamma X_i + \epsilon_i$$

you said that you can instrument $X$ by running the first stage

$$X_i = a + \rho_1 \text{exog1}_i + \rho_2 \text{exog2}_i + \rho_3 \text{exog3}_i + \rho_4 \text{exog4}_i + \phi_1 \text{inst1}_i + \phi_2 \text{inst2}_i + e_i$$

and then using the fitted values from it in the second stage. In the same spirit, if `inst1` and `inst2` are valid instruments for `X`, then `inst1*exog3` and `inst2*exog3` will be valid instruments for `X*exog3`, i.e. for a model

$$Y_i = \alpha + \beta_1 \text{exog1}_i + \beta_2 \text{exog2}_i + \beta_3 \text{exog3}_i + \beta_4 \text{exog4}_i + \gamma (X_i \cdot \text{exog3}_i) + \eta_i$$

the first stage would be

$$\begin{aligned} (X_i \cdot \text{exog3}_i) &= c + \delta_1 \text{exog1}_i + \delta_2 \text{exog2}_i + \delta_3 \text{exog3}_i + \delta_4 \text{exog4}_i + \psi_1 (\text{inst1}_i \cdot \text{exog3}_i) \\ &\quad + \psi_2 (\text{inst2}_i \cdot \text{exog3}_i) + u_i \end{aligned}$$

In Stata the least complicated way would be to generate the interactions by hand.
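Generating the interactions by hand might look roughly like the sketch below. The simulated data, the coefficient values, and the names `X_exog3`, `inst1_exog3`, `inst2_exog3` and `confound` are all hypothetical and exist only so the example runs on its own. With real data, skip the simulation part and keep the last four commands.

```stata
* --- Hypothetical simulated data (illustration only) ---
clear
set seed 12345
set obs 1000
generate exog1 = rnormal()
generate exog2 = rnormal()
generate exog3 = rnormal()
generate exog4 = rnormal()
generate inst1 = rnormal()
generate inst2 = rnormal()
* Unobserved confounder that makes X endogenous
generate confound = rnormal()
generate X = 0.5*inst1 + 0.5*inst2 + 0.3*exog1 + confound + rnormal()
generate Y = 1 + exog1 + exog2 + exog3 + exog4 + 2*X*exog3 + confound + rnormal()

* --- Interactions generated by hand ---
generate X_exog3     = X*exog3
generate inst1_exog3 = inst1*exog3
generate inst2_exog3 = inst2*exog3

* The interacted endogenous variable is instrumented by the interacted
* instruments; all exogenous covariates stay outside the parentheses.
ivregress 2sls Y exog1 exog2 exog3 exog4 (X_exog3 = inst1_exog3 inst2_exog3), first
```

This mirrors the interaction model written out above, where only `X*exog3` is endogenous. If the main effect of `X` were also in the model, it would presumably have to go inside the parentheses as a second endogenous regressor, with `inst1` and `inst2` added to the instrument list.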
This type of question has been asked before on the Statalist, so if you are interested in further discussion of the problem have a look here.