Solved – Choosing the maximum lag length in the augmented Dickey-Fuller test

augmented-dickey-fuller, lags, time series, unit root

I have a question regarding how to choose the maximum lag length in the augmented Dickey-Fuller test using the "urca" package in R.

I want to perform the ADF test on 12 years of daily prices of a stock index. I used the AIC option in the command to choose the optimal number of lags, but I don't know what value to set for the maximum lag length. With the maximum lag length set to 1, 75, 100, 250 and 365, the test statistic is -1.5088, -2.2627, -3.0098, -3.4081 and -3.6462, respectively. These statistics clearly lead to different results and interpretations…

I searched and found that a common recommendation is a maximum lag length of 1 for annual data, 4 for quarterly data and 12 for monthly data (with no information on daily data). With that in mind, could you please give me any suggestions? (I know it is really silly to use such large numbers as the maximum lag length…)

Besides, for the example above, I could try a maximum lag length of 365 because the sample is large. However, when testing a single year, there are fewer than 250 observations. What should I do if the maximum lag length you suggest for the first question exceeds 250?

Another question: would it be better to test the log of the stock index?

Thank you very much for your kind help!

Best Answer

This can be a tricky one. These Zivot notes discuss a slightly more advanced way to select lags for the ADF test. That being said, it is good to remember that the purpose of including lags is to control for serial correlation. Consequently, you'll want to examine the residuals of the test regression to make sure that no serial correlation remains. Even a good model fit (i.e. a low information criterion) does not ensure the absence of serial correlation.
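One common way to examine residuals for leftover serial correlation is the Ljung-Box Q statistic. The sketch below is only an illustration in plain Python: the function name `ljung_box_q` and the input list are hypothetical, and it assumes you have already extracted the residuals from the ADF test regression. Under the null of no autocorrelation, Q is approximately chi-squared with `max_lag` degrees of freedom, so a large Q suggests more lags are needed.

```python
def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic for autocorrelation up to max_lag.

    Q = n (n + 2) * sum_{k=1..max_lag} r_k^2 / (n - k),
    where r_k is the lag-k sample autocorrelation of the residuals.
    """
    n = len(residuals)
    if not 0 < max_lag < n:
        raise ValueError("max_lag must be between 1 and n - 1")

    mean = sum(residuals) / n
    dev = [r - mean for r in residuals]
    denom = sum(d * d for d in dev)  # n times the sample variance

    total = 0.0
    for k in range(1, max_lag + 1):
        # Lag-k sample autocorrelation.
        r_k = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        total += r_k ** 2 / (n - k)
    return n * (n + 2) * total


# Hypothetical residuals that alternate in sign, i.e. strongly
# negatively autocorrelated at lag 1.
example_residuals = [1.0, -1.0] * 5
q_stat = ljung_box_q(example_residuals, max_lag=1)
```

In R, `Box.test(x, lag = h, type = "Ljung-Box")` from the stats package computes the same statistic together with a p-value.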

It is essential to include the lowest number of lags that removes the serial correlation. Including superfluous lags will greatly diminish the test's power. This is especially problematic because ADF tests are well known to have low power, especially for near-unit-root processes.
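To keep the maximum lag length modest, a widely cited rule of thumb due to Schwert (1989) sets the upper bound to floor(12 * (T / 100)^(1/4)), where T is the number of observations. The sketch below only computes that bound; the function name `schwert_max_lag` is hypothetical, and the figure of roughly 250 trading days per year is an assumption, not something stated in the question.

```python
import math


def schwert_max_lag(n_obs):
    """Schwert's rule-of-thumb upper bound on the ADF lag length."""
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return math.floor(12 * (n_obs / 100) ** 0.25)


# Assumption: about 250 trading days per year.
twelve_years = schwert_max_lag(12 * 250)  # about 3000 daily observations
single_year = schwert_max_lag(250)        # one year of daily observations

print(twelve_years, single_year)  # prints "28 15"
```

The resulting number would then be passed as the `lags` argument of `ur.df` in urca, with `selectlags = "AIC"` choosing the lag length at or below that bound. Note that the bound grows very slowly with T, so it stays far below values such as 250 or 365 even for long daily samples.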

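Additionally, you should consider other tests for stochastic trends, such as the Phillips-Perron (PP) test or the KPSS test. Keep in mind that KPSS reverses the hypotheses: its null is stationarity, whereas the null of ADF and PP is a unit root.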

Finally, never lose perspective of the bigger picture. Stock prices, especially at a daily frequency, almost always follow a stochastic trend. If they did not follow a random walk, you could forecast prices with a relatively high level of certainty (the prediction interval for a stochastic trend explodes as the horizon grows, whereas for a stationary series it stays bounded). If that were the case, you could easily forecast stock prices and make billions of dollars. But stock markets are efficient, really efficient, and there are not billions of dollars lying around to be picked up. At least that is what Eugene Fama says.
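To make the point about prediction intervals concrete, here is a minimal sketch assuming a driftless random walk with independent innovations of standard deviation `sigma` and a normal approximation. Under those assumptions the h-step-ahead forecast is simply the last observed price, and the forecast error variance is h * sigma^2, so the interval widens with the square root of the horizon. The function name and the numbers are hypothetical.

```python
import math


def random_walk_interval(last_price, sigma, horizon, z=1.96):
    """Approximate 95% prediction interval for a driftless random walk.

    The point forecast is the last price; the forecast error standard
    deviation after `horizon` steps is sigma * sqrt(horizon).
    """
    if horizon < 1:
        raise ValueError("horizon must be at least 1")
    half_width = z * sigma * math.sqrt(horizon)
    return last_price - half_width, last_price + half_width


# Hypothetical index level of 100 with a daily innovation s.d. of 1.
for h in (1, 4, 25, 100):
    low, high = random_walk_interval(100.0, 1.0, h)
    print(h, round(low, 2), round(high, 2))
```

The interval width grows without bound as `h` increases, which is why a unit-root series cannot be forecast with much certainty far ahead.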