Solved – Check for endogeneity

endogeneity, log-linear

I run a log-linear model, $\log(y) = b_0 + b_1x_1 + b_2x_2 + e$. I think $x_1$ may be endogenous and I would like to test this, so that I can then run a two-stage model if needed. I would like to know whether there is a way to check it, or whether I must have at least one instrumental variable for $x_1$.

Another specification of my model is $\log(y) = b_0 + b_1\log(x_1) + b_2x_2 + e$. Can I use the same procedure when $x_1$ enters in logs, or is there a difference?

Best Answer

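In general, endogeneity is a theoretical property and not something that can be tested from the data at hand alone. Formal tests such as the Durbin-Wu-Hausman test compare the OLS and IV estimates, so they already require at least one valid instrument for $x_1$, as you say.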
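If you do have a candidate instrument, the test itself is short. The following is only a sketch: `z` is a hypothetical instrument that does not appear in the question, and `logy` is assumed to hold $\log(y)$. Whether `z` is actually valid (relevant and excludable) remains a theoretical argument that no command can settle.

```stata
* Sketch only: z is a HYPOTHETICAL instrument for x1, not a variable from the question
* logy is assumed to already contain log(y)

* Two-stage least squares, treating x1 as endogenous and x2 as exogenous
ivregress 2sls logy x2 (x1 = z)

* Durbin and Wu-Hausman tests; H0: x1 is exogenous
estat endogenous

* First-stage diagnostics, to check that z is not a weak instrument
estat firststage
```

A small p-value from `estat endogenous` is evidence against exogeneity of `x1`, conditional on `z` being a valid instrument. For the second specification the same commands would apply with $\log(x_1)$ in place of `x1` as the endogenous regressor.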

The second question sounds more like you are wondering which functional form fits best. The parameter values will certainly differ (in the second specification $b_1$ is an elasticity rather than a semi-elasticity), but the predictions from the two models may turn out to be very similar. You can fit both, predict, and inspect the results visually:

For example, estimate both models and compute $\widehat{\log y_1}$ as the predicted values from the first model and $\widehat{\log y_2}$ as the predicted values from the second. Then you can plot them against each other, or plot both against $x_1$ together with the data.

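Stata code for this could be:

* Model 1: x1 enters linearly
regress logy x1 x2
predict yhat1, xb
* Model 2: x1 enters in logs
generate logx1 = log(x1)
regress logy logx1 x2
predict yhat2, xb
* Overlay the data and both sets of fitted values against x1
twoway (scatter logy x1) (scatter yhat1 x1) (scatter yhat2 x1), legend(order(1 "data" 2 "linear" 3 "logarithmic"))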
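
To compare the two sets of predictions directly rather than against $x_1$, something like the following could be added. It is a sketch that assumes `yhat1` and `yhat2` already exist from the `predict` commands above; the axis titles and legend labels are only illustrative.

```stata
* Assumes yhat1 and yhat2 were created by the predict commands above

* Correlation between the two sets of fitted values
correlate yhat1 yhat2

* Plot the predictions against each other with a 45-degree reference line
twoway (scatter yhat2 yhat1) (function y = x, range(yhat1)), legend(order(1 "predictions" 2 "45-degree line")) xtitle("fitted values, x1 linear") ytitle("fitted values, x1 in logs")
```

If the points lie close to the 45-degree line, the two functional forms give nearly the same fitted values over the observed range of the data.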