Regression – How to Understand Differences Between Aggregate and Firm-Level Regression Coefficients


I recently came across a study that finds the following two results:

  • At the firm level, the independent variable $X_{it}$ has a positive effect on the dependent variable $Y_{it}$. Concretely, the study shows that the time-series regression coefficient of realized returns on earnings changes (deflated by stock prices) is positive for almost all firms $i$. That is, they run a time-series regression for each firm and look at the distribution of the betas. Both the mean and the median are positive; only 10.7% of firms have a negative beta.
  • At the aggregate level, they find the opposite relation, i.e. aggregate earnings changes (either value- or equal-weighted) have a negative beta on aggregate stock returns.

I am rather surprised by this finding. I searched for it online, and at first it seemed that the ecological fallacy (see Wikipedia) could explain it. However, I no longer think so.

If I am correct, the mechanism behind the ecological fallacy is a different one. For instance, take the literacy-immigration example from Wikipedia: within each group/state, illiteracy is higher for immigrants, but since immigrants settle in states with higher literacy, on aggregate there is a negative relation between the percentage of immigrants and illiteracy. So the effect, if I understand it correctly, occurs because the groups, in this case the states, are different to start with and the immigrants can choose the states. Let's assume instead that immigrants are randomly assigned to states. Then this effect wouldn't occur, right?

However, in my example, there is no sample selection, at least none that I am aware of, since the groups are the time periods. That is, in each time period all firms in the sample are aggregated, and firms can't really choose the time period.

Of course, it could be that firms go bankrupt in bad states of the economy, so the sample size varies between periods. But let's ignore that for a second and assume that the number of firms stays constant throughout the sample: How can the regression coefficient at the aggregate level be so different from the distribution of the regression coefficients at the firm level? Both a formal answer and an intuitive one (maybe a small example) would be great.

Best Answer

As pointed out by the commenters, this seems to be an instance of Simpson's paradox. The difference between the relationships at the aggregate and within-group levels can really be as large as you want it to be (no correlation at one level, strong positive or negative correlation at the other, etc.).

I am not sure about the mechanism that could account for it in your case but I think this is easiest to understand by looking at some plots. I don't have time to create a fictional example tailored to your situation and I obviously don't have your data at hand but here is one I created some time ago:

[Figure: scatter plot illustrating Simpson's paradox]

As I said, this plot was generated in another context, but let's imagine that points of the same color/shape represent measurements from the same firm and that the two variables are some sort of interesting quantitative characteristics. Within each firm, there is basically no relationship. At the aggregate level, there is a perfect positive correlation. Pooling all the data and ignoring the structure of the data set, there seems to be a high positive correlation, driven by the aggregate-level correlation and dampened by the (smaller) within-firm variance. Roughly, one interpretation of this example would be that the two variables do not move together over time within a firm, but that different firms have different, stable baseline levels on each variable and that those baselines are related.
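To make the three levels concrete, here is a minimal simulation sketch. It is not the code behind the plot; the number of firms, the baselines, and the noise scale are all made-up assumptions chosen to mimic the pattern described above (no within-firm relation, related baselines).

```python
import numpy as np

rng = np.random.default_rng(0)
n_firms, n_obs = 5, 50  # hypothetical sizes

# Assumed firm baselines: firm j sits at level j on both variables
baseline = np.arange(n_firms, dtype=float)

# Within each firm, x and y fluctuate independently around the baseline
x = baseline[:, None] + rng.normal(scale=0.2, size=(n_firms, n_obs))
y = baseline[:, None] + rng.normal(scale=0.2, size=(n_firms, n_obs))

def slope(a, b):
    """OLS slope of b on a (regression with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

# One slope per firm, one slope across firm means, one slope on pooled data
within_slopes = np.array([slope(x[j], y[j]) for j in range(n_firms)])
between_slope = slope(x.mean(axis=1), y.mean(axis=1))
pooled_slope = slope(x.ravel(), y.ravel())

print("within-firm slopes:", within_slopes.round(2))
print("slope across firm means:", round(between_slope, 2))
print("pooled slope:", round(pooled_slope, 2))
```

The within-firm slopes should scatter around zero, while the slope across firm means is close to one and the pooled slope sits slightly below it, pulled down by the within-firm noise.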

Since one of your variables seems to be a difference and the other a ratio, providing an intuitive interpretation is more difficult, but graphically and mathematically pretty much everything is possible: the correlation and regression coefficients at the two levels are just two different things. I think that this is the key insight behind Simpson's paradox, and the examples with dichotomous variables and the interpretations that go with them are just special cases.
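For the setting in the question (time-series betas per firm versus a time-series beta on cross-sectional averages), here is a hedged sketch of one data-generating process that produces the sign flip. It is purely an assumption for illustration, not a claim about what drives the result in the study: $X_{it} = f_t + e_{it}$ and $Y_{it} = \gamma f_t + \beta e_{it} + u_{it}$, with $\beta > 0$, $\gamma < 0$, and the firm-specific part $e_{it}$ much more variable than the common part $f_t$. All parameter values below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_firms, n_periods = 200, 400  # hypothetical panel dimensions

beta, gamma = 1.0, -2.0  # assumed effects of the firm-specific and common parts

common = rng.normal(scale=0.3, size=n_periods)           # f_t, shared by all firms
idio = rng.normal(scale=1.0, size=(n_firms, n_periods))  # e_it, firm-specific
noise = rng.normal(scale=0.5, size=(n_firms, n_periods)) # u_it

x = common + idio                         # X_it = f_t + e_it
y = gamma * common + beta * idio + noise  # Y_it = gamma*f_t + beta*e_it + u_it

def slope(a, b):
    """OLS slope of b on a (regression with an intercept)."""
    return np.cov(a, b)[0, 1] / np.var(a, ddof=1)

# Firm level: one time-series regression per firm
firm_betas = np.array([slope(x[i], y[i]) for i in range(n_firms)])

# Aggregate level: equal-weighted cross-sectional averages per period
agg_beta = slope(x.mean(axis=0), y.mean(axis=0))

print("mean firm beta:", round(firm_betas.mean(), 2))
print("share of negative firm betas:", (firm_betas < 0).mean())
print("aggregate beta:", round(agg_beta, 2))
```

In this sketch each firm-level regression is dominated by the firm-specific variation, so its slope is close to $(\beta\,\mathrm{Var}(e) + \gamma\,\mathrm{Var}(f)) / (\mathrm{Var}(e) + \mathrm{Var}(f))$, which is positive here. Averaging across firms washes out most of $e_{it}$ and leaves mainly $f_t$, so the aggregate slope is close to $\gamma$ and negative. No sample selection is needed for the two levels to disagree.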
