Solved – The between estimator in panel data

When you compute the between estimator for your panel data, you regress the subject-level averages of the outcome variable on the subject-level averages of the explanatory variables.

But in this regression model, do you have to include an intercept?

Out of our textbook:

The between estimator exploits the cross-sectional dimension
(differences between units) of the data by regressing the individual
averages of y on the individual averages of x and a constant using OLS

I have played around with the example data from Gujarati's book Basic Econometrics, chapter 16. You can find the data here: http://shazam.econ.ubc.ca/student/gujarati/table15.4

This is the plot I made (the colors represent 3 different companies):

On the plot you can see the pooled OLS regression and the fixed effects (aka within estimator) regression. What would the between estimator look like? Does it have one intercept or three? And what if all the data points of the companies lie exactly above each other, so that they have the same average x?

[Figure: fixed effects plus pooled OLS]

Best Answer

As you said correctly, the between estimator takes the individual effects model $$y_{it} = \alpha_i + x'_{it}\beta + \epsilon_{it}$$ and averages out the time component, resulting in the regression $$\overline{y}_{i.} = \alpha + \overline{x}'_{i.}\beta + (\alpha_i - \alpha + \overline{\epsilon}_{i.})$$ where bars indicate averages and the dot signifies that time has been averaged out. There is a single common intercept $\alpha$, and you still need it in this model to estimate $\beta$ consistently; the deviations $\alpha_i - \alpha$ become part of the error term.
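
The averaging step can be written out explicitly. This is a sketch that assumes a balanced panel with $T$ observations per unit ($T$ is notation introduced here, not in the question). Averaging the individual effects model over $t$ gives

$$\overline{y}_{i.} = \frac{1}{T}\sum_{t=1}^{T} y_{it} = \alpha_i + \overline{x}'_{i.}\beta + \overline{\epsilon}_{i.}$$

and adding and subtracting a common constant $\alpha$ gives

$$\overline{y}_{i.} = \alpha + \overline{x}'_{i.}\beta + (\alpha_i - \alpha + \overline{\epsilon}_{i.})$$

so the regression has one observation per unit, one intercept, and a composite error that contains the unit-specific deviation $\alpha_i - \alpha$.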

Note though that this estimator only uses the cross-sectional information and completely discards the time variation in your data. The estimator is only consistent if the $\alpha_i$ are random effects, i.e. uncorrelated with the regressors (though in this case you may opt for the random effects estimator, which is more efficient and also uses the time variation in the data).

You can easily implement the between estimator in your statistical software by averaging the data for each panel unit to remove the time component and then regressing the averaged outcome on the averaged explanatory variables and a constant. For more information on this topic see for instance Cameron and Trivedi (2009) "Microeconometrics using Stata" or Wooldridge (2010) "Econometric Analysis of Cross Section and Panel Data".
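
The two steps can be sketched in plain Python for the single-regressor case. This is a minimal illustration, not a replacement for a panel data package: the function name `between_estimator` and the small `panel` data set are made up for this example and are not the Gujarati data.

```python
from collections import defaultdict
from statistics import mean


def between_estimator(rows):
    """Between estimator for a single regressor (hypothetical helper).

    rows: iterable of (unit_id, x, y), one entry per unit and period.
    Returns (intercept, slope) from OLS of the unit means of y on the
    unit means of x and a constant.
    """
    by_unit = defaultdict(list)
    for unit, x, y in rows:
        by_unit[unit].append((x, y))

    # Step 1: average out the time dimension, one (x_bar, y_bar) per unit.
    x_bar = [mean(x for x, _ in obs) for obs in by_unit.values()]
    y_bar = [mean(y for _, y in obs) for obs in by_unit.values()]

    # Step 2: OLS with a single intercept on the unit means.
    mx, my = mean(x_bar), mean(y_bar)
    sxx = sum((x - mx) ** 2 for x in x_bar)
    if sxx == 0:
        # All units share the same average x: the regressor is collinear
        # with the constant and the slope cannot be estimated.
        raise ValueError("unit means of x do not vary; slope is not identified")
    sxy = sum((x - mx) * (y - my) for x, y in zip(x_bar, y_bar))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Made-up panel: 3 units, 2 periods each. Within each unit y falls as x
# rises, but across units the means rise together.
panel = [
    ("A", 1.0, 6.0), ("A", 3.0, 4.0),
    ("B", 4.0, 12.0), ("B", 6.0, 10.0),
    ("C", 7.0, 18.0), ("C", 9.0, 16.0),
]
intercept, slope = between_estimator(panel)
print(intercept, slope)  # 1.0 2.0 for this made-up data
```

In this toy data the unit means are (2, 5), (5, 11) and (8, 17), so the between regression has slope 2 even though the slope within every unit is -1. The `ValueError` branch corresponds to the case raised in the question: if every company has the same average x, there is no cross-sectional variation left and the between slope is not identified.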
