When performing the Johansen cointegration test on two time series (the simple case), you need to decide which lag length to use. Running the test with different lags returns different results: for some lag lengths the null hypothesis can be rejected, but for others it cannot.

My question is: what is the right method, based on the input data, for deciding which lag to use when performing the Johansen test?

P.S. I submitted this question to quant.stackexchange, but some suggested it is a better fit for this group.

## Best Answer

You are correct. The weakness of the Johansen approach is that it is sensitive to the lag length, so the lag length should be determined in a systematic manner. The following is the usual process in the literature.

a. Choose a maximum lag length "m" for the VAR model. Usually, this is set to 1 for annual data, 4 for quarterly data, and 12 for monthly data.

b. Run the VAR model in levels for every lag length up to m. For example, if the data is monthly, run the VAR model for lag lengths 1, 2, 3, ..., 12.

c. Find the AIC (Akaike information criterion) and SIC (Schwarz information criterion) for the VAR model at each lag length. [There are also other criteria, such as HQ (Hannan-Quinn information criterion) and FPE (final prediction error criterion), but AIC and SIC are the most widely used.] Choose the lag length that minimizes AIC and SIC for the VAR model. Note that SIC and AIC may give conflicting results: SIC penalizes additional parameters more heavily, so it tends to select a shorter lag than AIC.

d. Finally, you MUST confirm that, for the lag length selected in step c, the residuals of the VAR model are not autocorrelated [use a portmanteau test for autocorrelation]. If there is autocorrelation, you may have to modify the lag length. Beginners in time series econometrics tend to skip step d.

e. For the cointegration test, the lag length is the one chosen in step d minus one (since the model is now run in first differences, unlike the VAR in levels that was used to decide the lag length).
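
Steps a to e can be sketched in Python roughly as follows. This is a minimal illustration using only NumPy, not a reference implementation. The function names (`var_residuals`, `aic_sic`, `portmanteau_ok`, `johansen_lag`), the simulated data, the default number of autocorrelation lags `h`, and the rule "lengthen the lag by one until the portmanteau test passes" are all assumptions made for the example. The 5% critical value uses the Wilson-Hilferty approximation to the chi-squared quantile rather than an exact one.

```python
import numpy as np

def var_residuals(y, p, max_lag):
    """OLS residuals of a VAR(p) in levels with an intercept.

    The first max_lag rows are always dropped, so every p is fitted
    on the same sample and the criteria are comparable across p.
    """
    T = y.shape[0]
    lags = [y[max_lag - j:T - j] for j in range(1, p + 1)]
    X = np.hstack([np.ones((T - max_lag, 1))] + lags)
    Y = y[max_lag:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return Y - X @ coef

def aic_sic(resid, p):
    """(AIC, SIC) from the log-determinant of the residual covariance."""
    n, k = resid.shape
    log_det = np.linalg.slogdet(resid.T @ resid / n)[1]
    n_params = k * (k * p + 1)  # k equations, each with k*p lags + intercept
    return log_det + 2 * n_params / n, log_det + np.log(n) * n_params / n

def portmanteau_ok(resid, p, h, z=1.645):
    """True if residual autocorrelation up to lag h is not rejected at ~5%.

    Requires h > p so that the degrees of freedom are positive.
    """
    n, k = resid.shape
    c0_inv = np.linalg.inv(resid.T @ resid / n)
    q = 0.0
    for i in range(1, h + 1):
        ci = resid[i:].T @ resid[:-i] / n  # lag-i residual autocovariance
        q += np.trace(ci.T @ c0_inv @ ci @ c0_inv)
    q *= n
    df = k * k * (h - p)
    # Wilson-Hilferty approximation to the 95% chi-squared quantile
    crit = df * (1 - 2 / (9 * df) + z * np.sqrt(2 / (9 * df))) ** 3
    return q < crit

def johansen_lag(y, max_lag, h=None, criterion="sic"):
    """Return (lag in levels, lag in differences, criteria per lag)."""
    h = h or max_lag + 4
    # Steps b and c: fit VAR(p) in levels for p = 1..max_lag
    crit = {p: aic_sic(var_residuals(y, p, max_lag), p)
            for p in range(1, max_lag + 1)}
    idx = 0 if criterion == "aic" else 1
    p = min(crit, key=lambda lag: crit[lag][idx])
    # Step d: lengthen the lag while the residuals look autocorrelated
    while p < max_lag and not portmanteau_ok(var_residuals(y, p, max_lag), p, h):
        p += 1
    # Step e: the model in first differences uses one lag fewer
    return p, p - 1, crit

# Simulated cointegrated pair (illustrative data only)
rng = np.random.default_rng(0)
x = np.cumsum(rng.normal(size=400))
y = np.column_stack([x, x + rng.normal(size=400)])

p_levels, k_ar_diff, crit = johansen_lag(y, max_lag=12)  # step a: m = 12
print("VAR lag in levels:", p_levels, "| lagged differences:", k_ar_diff)
```

The second returned value is the number you would pass as the lagged-differences argument of a Johansen routine, for example `k_ar_diff` in statsmodels' `coint_johansen(endog, det_order, k_ar_diff)`. If statsmodels is available, its `VAR(...).select_order(maxlags=...)` and `VARResults.test_whiteness(...)` cover steps b to d without the hand-written helpers.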