Serial correlation in financial returns (S&P500) in large samples

Tags: autocorrelation, diagnostic, finance, time series

I have a daily stock returns series and a squared returns series for the S&P500.

The Ljung-Box Q-Statistic for the squared returns series says there is autocorrelation and therefore ARCH effects (p-value of the Q-statistic is zero at all lags, and there are significant spikes in the ACF and PACF). My understanding is that this is correct and in line with what I should be expecting.

However, for the daily returns series over the full sample, the p-values of the Q-statistic are all below 0.05 and highly significant (the p-value is 0 at all lags), whereas for the in-sample period the p-values are all greater than 0.05, implying no serial correlation. This looks like a contradiction. My understanding, following Tsay, is that daily returns should show no serial correlation, or only minor serial correlation.

So what is the reason for the observed serial correlation in the full sample? Is it because of the large sample, and should I be worried? For the purposes of modelling volatility for forecasting, should I ignore the serial correlation in the full sample and just focus on the fact that the in-sample period does not have serial correlation?

Note: The full sample contains 5,218 observations. The in-sample period contains 2,380 observations.

Best Answer

First of all: slight autocorrelation is not unusual for (non-squared) stock return data, in my experience. Otherwise there would be no point, from an investor's perspective, in trying to decide whether there is a positive or negative trend.

How big is the difference between both $p$-values? Which lag orders did you choose for the test? The differing values might come from the differing power of the test at the two sample sizes, combined with a weak trend in the in-sample period.
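To make the power argument concrete, here is a minimal sketch (not your actual data) that plugs the same weak autocorrelation into the Ljung-Box statistic $Q = n(n+2)\sum_{k=1}^{h} \hat\rho_k^2/(n-k)$ at your two sample sizes. The value 0.025 at each of lags 1 to 10 is purely an assumption for illustration, and the helper names `ljung_box_q` and `chi2_sf_even_df` are made up here. The p-value uses the closed-form chi-square tail that only holds for an even number of lags.

```python
import math

def ljung_box_q(rhos, n):
    """Ljung-Box Q for sample autocorrelations rhos at lags 1..h, sample size n."""
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(rhos, start=1))

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; this closed form is valid only for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j          # term is now half**j / j!
        total += term
    return math.exp(-half) * total

# Hypothetical input: identical weak autocorrelation of 0.025 at lags 1..10.
rhos = [0.025] * 10

results = {}
for label, n in [("in-sample", 2380), ("full sample", 5218)]:
    q = ljung_box_q(rhos, n)
    p = chi2_sf_even_df(q, df=len(rhos))
    results[label] = (q, p)
    print(f"{label:12s} n={n}  Q={q:.2f}  p={p:.4f}")
```

With these assumed autocorrelations the in-sample p-value stays above 0.05 while the full-sample one falls well below it, even though the underlying dependence is the same. $Q$ grows roughly in proportion to $n$, so a larger sample flags weaker autocorrelation.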

In practice, you may try to include an AR model for the mean series. If the AR coefficients aren't significant (which is to be expected), I'd suggest just using a fixed mean for the daily return data and modelling the volatility with a GARCH-type model that can handle the leverage effect of stock returns (EGARCH, Beta-t-EGARCH).
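As a rough sketch of the "is the AR coefficient significant?" check, the snippet below fits an AR(1) by ordinary least squares and reports a conventional t-statistic. It runs on simulated series rather than S&P500 returns, the function names `ar1_ols` and `simulate_ar1` are hypothetical, and the standard error is the plain OLS one, which does not account for ARCH effects, so on real returns treat the t-statistic as indicative only. The GARCH step itself is not shown.

```python
import math
import random

def ar1_ols(x):
    """OLS fit of x[t] = c + phi * x[t-1] + e[t]; returns (phi_hat, t_stat)."""
    y, z = x[1:], x[:-1]
    m = len(y)
    zbar, ybar = sum(z) / m, sum(y) / m
    szz = sum((a - zbar) ** 2 for a in z)
    szy = sum((a - zbar) * (b - ybar) for a, b in zip(z, y))
    phi = szy / szz
    c = ybar - phi * zbar
    rss = sum((b - c - phi * a) ** 2 for a, b in zip(z, y))
    se = math.sqrt(rss / (m - 2) / szz)   # conventional (homoskedastic) OLS s.e.
    return phi, phi / se

def simulate_ar1(phi, n, rng):
    """Simulate n observations of a zero-mean Gaussian AR(1) with coefficient phi."""
    x, prev = [], 0.0
    for _ in range(n):
        prev = phi * prev + rng.gauss(0.0, 1.0)
        x.append(prev)
    return x

rng = random.Random(0)                       # fixed seed for reproducibility
white = simulate_ar1(0.0, 2380, rng)         # no serial correlation
persistent = simulate_ar1(0.5, 2380, rng)    # clear serial correlation

phi_w, t_w = ar1_ols(white)
phi_p, t_p = ar1_ols(persistent)
print(f"white noise : phi={phi_w:+.3f}  t={t_w:+.2f}")
print(f"AR(1), 0.5  : phi={phi_p:+.3f}  t={t_p:+.2f}")
```

If the t-statistic on your return series is small, as with the white-noise series here, a constant mean is a reasonable choice and the modelling effort can go into the volatility equation.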
