Can adjusted $R^2$ compare models across different samples?

r-squared, regression

I was reading this journal article about the decrease in the relevance of the income statement:

Basically what the author does is this:

(1) For each year, he takes that year's (say, 2010) income statement information from a set of companies as the $X$ values (independent variable), and the 1-year return (2011 share price/2010 share price) of the companies' stocks as the $Y$ values (dependent variable). He generates a regression model and calculates the adjusted $R^2$ for this model.

(2) He repeats step (1) for each of 30 years.

(3) He plots the adjusted $R^2$ for each year over time and fits a regression line to it. It turns out that adjusted $R^2$ has decreased from about 10% to 7%. Thus, he concludes that the income statement has lost relevance over time.
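The three steps above can be sketched in Python as follows. This is only an illustration on synthetic data: the firm count, the number of regressors, the year range, and the weakening signal are all made-up assumptions, not the paper's data or specification.

```python
import numpy as np

def r2_and_adjusted(X, y):
    """Fit OLS with an intercept and return (R^2, adjusted R^2)."""
    n, k = X.shape                       # n observations, k regressors
    A = np.column_stack([np.ones(n), X]) # add the intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    adj = 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)
    return r2, adj

rng = np.random.default_rng(0)
years = np.arange(1981, 2011)            # 30 hypothetical years
n_firms, k = 500, 3                      # hypothetical sample size and regressor count

adj_by_year = []
for i, year in enumerate(years):
    # Steps (1) and (2): a fresh sample of firms each year (synthetic here).
    X = rng.normal(size=(n_firms, k))    # stand-in for income statement items
    signal = 0.35 - 0.005 * i            # made-up, slowly weakening relationship
    y = signal * X.sum(axis=1) + rng.normal(size=n_firms)  # stand-in for returns
    adj_by_year.append(r2_and_adjusted(X, y)[1])
adj_by_year = np.array(adj_by_year)

# Step (3): regress the yearly adjusted R^2 values on time.
t = years - years[0]
slope, intercept = np.polyfit(t, adj_by_year, 1)
print(f"trend in adjusted R^2 per year: {slope:.5f}")
```

Note that every yearly value comes from a different sample, which is exactly the feature the question is about.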

I've seen adjusted $R^2$ used to compare different models using the same sample data. But I've never seen it used in such a manner before because the data used to generate each model is different.

Any advice would be much appreciated.

Best Answer

From a quick reading of the paper you mention, the author makes an unnecessary methodological mistake.
Adjusted $R^2$ is NOT a measure of fit (and "fit" is what the author, not unjustifiably, maps conceptually to "explanatory relevance"). Theil proposed this metric to evaluate alternative sets of regressors on the same data set, as you point out, while penalizing increases in the number of regressors used. Because of the way the metric is constructed, it cannot be meaningfully interpreted as showing the explanatory power of the regressors.
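For reference, the usual textbook definition is given below. The symbols are my own notation rather than the paper's: $n$ is the number of observations and $k$ the number of regressors, not counting the intercept.

$$\bar{R}^2 = 1 - (1 - R^2)\,\frac{n-1}{n-k-1}$$

The correction factor $\frac{n-1}{n-k-1}$ depends on $n$ and $k$ as well as on the fit itself, and $\bar{R}^2$ can become negative when $R^2$ is small relative to $k/(n-1)$, which has no reading as a share of explained variance.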

But I believe the author had no need to use adjusted $R^2$, because the various specifications he implements contain very few regressors. He could have used simple $R^2$ and would most probably have arrived at the same conclusions. These would then be methodologically valid, because $R^2$ is indeed a measure of fit, and comparing $R^2$ for the same regressors across different data sets, or over time, creates no issues.
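To get a feel for why the choice probably matters little with few regressors, here is a minimal sketch using the standard adjusted $R^2$ formula. The values of $R^2$, $n$, and $k$ are hypothetical, chosen only for illustration, and do not come from the paper.

```python
def adjusted_r2(r2, n, k):
    """Standard adjusted R^2 for n observations and k regressors (intercept excluded)."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

r2, k = 0.10, 3  # hypothetical fit and regressor count
for n in (20, 100, 1000):
    adj = adjusted_r2(r2, n, k)
    # The gap equals (1 - r2) * k / (n - k - 1), so it shrinks as n grows.
    print(f"n={n:5d}  R^2={r2:.3f}  adjusted={adj:.4f}  gap={r2 - adj:.4f}")
```

With only a handful of regressors and hundreds of firms per year, the two measures should be nearly identical; with a very small sample, the adjusted value can even drop below zero.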

You could contact the author about this and ask why he used adjusted $R^2$ instead of simple $R^2$; it is always good when papers generate a discussion.