Fixed Effects Gravity Model for forecasting with time-constant variables in Stata

fixed-effects-model, panel data, poisson-regression, stata

I have a large panel data set of global trade flows, and I would like to build a model for forecasting trade between specific country pairs. I am having trouble figuring out how to do this in Stata using fixed effects while still accounting for the individual effects.

I found this research paper and would essentially like to recreate the process explained on p.299: http://ageconsearch.umn.edu/bitstream/43996/2/martinez.pdf

"A problem we faced with FEM is that we cannot directly estimate
variables that do not change over time because the inherent
transformation wipes out such variables. However, these variables can
be easily estimated in a second step, running another regression with
the individual effects as the dependent variable and distance and
dummies as explanatory variables."

I can easily run xtreg with trade volume regressed on GDP and GDP per capita, but what steps do I then take to recover the individual effects and estimate the coefficients on the time-invariant variables (dist, island, etc.) in Stata?

Note: I assume they are using FE OLS, but because my data contain many zero trade flows, is it possible to apply this method with xtpoisson instead?

Best Answer

In the paper, they estimate their gravity model (say, equation 5.2) with the fixed effects estimator and recover the estimated fixed effects so they can use them later in equation 6. You can obtain these with the predict command after xtreg with the fe option. In Stata this would be:

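* data must already be xtset; the fe option is required (xtreg defaults to random effects)
xtreg IX lYi lYj lNi lNj lD lIi lIj Pij1-Pijh, fe
* store the estimated individual (country-pair) effects
predict IE, u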

In the fixed effects regression all the time-invariant variables drop out as the authors stated. The predict command then gives you the individual effects $\text{IE}$ which they use in equation 6.
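
The second step described in the quoted passage could be sketched as below. This is only a sketch under assumptions: pairid (the country-pair identifier) and island are hypothetical variable names, lD is taken to be log distance, and IE is the variable created by predict above. Because IE is constant within a pair, the regression keeps one observation per pair rather than counting each pair once per year.

```stata
* Sketch only: pairid and island are hypothetical names, not taken from the paper
* IE comes from "predict IE, u" after the fixed-effects regression
* flag a single observation per country pair among those with a non-missing IE
egen byte onepair = tag(pairid) if !missing(IE)
* second step: regress the estimated effects on the time-invariant variables
regress IE lD island if onepair
```

The coefficients from this regression play the role of the distance and dummy effects that the fixed-effects transformation removed in the first step.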

With regard to your note, I'm not sure whether the same procedure applies to xtpoisson, given that the interpretation of the estimated fixed effects changes. For this, have a look at a similar question on Statalist and the corresponding answer by Maarten Buis. He is also active on CV, so if you're lucky he can provide you with guidance on this. Otherwise, I would guess that Martinez-Zarzoso and Nowak-Lehmann had the same problem with the many zeros (I suppose their data are similar to yours, given the similarity of the application) and yet they had their reasons to stick to linear models.
I hope this helps.
