Solved – Why does a regression model of portfolio returns give a smaller (i.e., negative) adjusted R-squared than expected?

model selection, r-squared, regression, self-study, spss

I have a question about the adjusted $R^2$ of a specific regression model.

I am doing a project on the January effect, and I am using a model from a journal article:

$$R_i = a_0 + a_1D_{\mathrm{Jan}} + \varepsilon_i$$

where

  • $R_i$ is the daily return of the portfolio/index,

  • $a_0$ is the mean daily return in non-January months,

  • $a_1$ is the excess of the mean January return over the mean non-January return,

  • $D_{\mathrm{Jan}}$ is a dummy variable (1 for January, 0 otherwise).

With this model I tried to test whether January returns are significantly greater than non-January returns, especially for small-capitalization stocks.

I have returns for size-sorted portfolios 1 to 4, where Portfolio 1 (P1) consists of the smallest-cap stocks and P4 consists of the largest-cap stocks.

I ran the regression in Excel and SPSS, using all daily returns of each portfolio from January to December over 10 years as the dependent variable and the dummy (1 for January, 0 otherwise) as the independent variable. I get a negative adjusted $R^2$ for all portfolios (P1-P4), mostly about -0.05. The $R^2$ itself was also very small, at about 0.0001.

The journal article I am basing my model on used monthly returns as $R_i$, but I changed this to daily returns. I still get a negative adjusted $R^2$ when I use monthly returns.

Can anyone help me work out what is wrong with the model? Did I use the wrong input for the given model? If so, what should the correct input be? Or, if the model itself is wrong, what model should I use?

Here are the results of my test: the p-value of the dummy variable and the adjusted $R^2$ for each portfolio.

  • P1: p-value 0.54, adjusted $R^2$ -0.0002.

  • P2: p-value 0.36, adjusted $R^2$ 0.0004.

  • P3: p-value 0.68, adjusted $R^2$ -0.0003.

  • P4: p-value 0.14, adjusted $R^2$ 0.0005.

I understand that the dummy variable is insignificant in every case, but I am confused about how to interpret the $R^2$.

Best Answer

You've included an interaction term without including both of the main effects that are the components of that interaction. According to standard practice, you need a term for January returns. Exceptions to this rule are rare, though they have been discussed on this site recently in "Including the interaction but not the main effects in a model".

Beyond that (which may no longer apply after edits to the question), many people will obtain a negative adjusted $R^2$ when trying to predict something as difficult as stock returns. The $R^2$ itself is so tiny that when the model is penalized for its number of predictors ($k$), the resulting adjusted $R^2$ will quite often go negative. This is especially true if the sample size is small, since the sample size $n$ is, along with $k$, part of what determines the adjustment:

$$R^2_{\mathrm{adj}} = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$
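As a rough check of the adjustment formula, here is a minimal Python sketch that plugs in the $R^2$ of 0.0001 reported in the question. The sample size is an assumption (about 252 trading days per year for 10 years; the question does not give the exact count), and the function names are only illustrative. Rearranging the formula shows that the adjusted $R^2$ is negative exactly when $R^2 < k/(n - 1)$, which the sketch also computes.

```python
def adjusted_r2(r2: float, n: int, k: int) -> float:
    """Adjusted R^2 for a model with k predictors fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 observations")
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)


def negative_threshold(n: int, k: int) -> float:
    """Adjusted R^2 is negative exactly when R^2 is below k / (n - 1)."""
    return k / (n - 1)


# Assumption: roughly 252 trading days per year over 10 years.
n_daily = 252 * 10
k = 1        # one predictor: the January dummy
r2 = 0.0001  # the R^2 reported in the question

adj = adjusted_r2(r2, n_daily, k)
cutoff = negative_threshold(n_daily, k)

print(f"adjusted R^2 = {adj:.6f}")
print(f"adjusted R^2 is negative whenever R^2 < {cutoff:.6f}")
```

Under these assumptions the adjusted $R^2$ comes out at roughly -0.0003, the same order of magnitude as the values listed in the question, so a slightly negative value says only that the dummy explains less than the penalty for including it. With about 120 monthly observations the cutoff rises to $1/119$, so a negative value is even easier to obtain there.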