Solved – How many lags should I include in a VAR model

lags, time series, vector-autoregression

When building a VAR model with six variables and 117 observations, I ran into the following situation. After fitting a VAR(1), the overall portmanteau test indicates that the residuals are fine ($p=0.85$, $p_{adjusted}=0.22$). But when I look at the individual residual series, the ACFs all look like white noise except for one of the six. For that one it seems I need a VAR(2), and the portmanteau test for this single residual series gives $p=0.067$, $p_{adjusted}=0.029$.

I'm unsure what to do in this situation. I ran into the same issue in another VAR, where one residual series looked as if lag 5 should be included.

So the first question is, how do I decide if 1 lag is enough or if I need 2 lags?

Next: if I decide that 2 lags are needed, do I have to let the model estimate the whole lag 2 matrix even if only one variable needs the second lag? Or is it reasonable to free only the row (and/or column) of the lag 2 matrix that belongs to this variable? See the picture below: imagine only variable E shows elevated autocorrelation at lag 2. The picture only illustrates the problem; normally a constant and deterministic trend terms are included, too.

[Image: lag 1 and lag 2 coefficient matrices, with only the row and column of variable E filled in at lag 2]

What should I do if one residual series suggests adding only lag 5? Would I then skip the matrices for lags 2, 3 and 4 and include only the one for lag 5?


A related question: if I first include only intercepts and deterministic trends and see that some residuals are already white noise, does it make sense to include any predictors for those variables? Should I include higher lags only for the variables whose residuals call for them?

NOTE: $p$ and $p_{adjusted}$ are calculated as follows:

[Image: formulas for $p$ and $p_{adjusted}$]

Best Answer

Regarding the first question, different equations of a VAR model need not have the same lag order. Each equation is meaningful by itself and can be treated separately (as regards estimation). If you find that one of the equations may benefit from including some more regressors, you may as well do that.
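As a minimal sketch of what "different lag orders per equation" can look like, the Python snippet below fits each equation by OLS on its own set of regressors. The data are simulated, and the function name `fit_equation`, the index 4 for variable E, and the coefficient values are all assumptions made for illustration; they are not taken from the question.

```python
import numpy as np

def fit_equation(y, target, lags, max_lag):
    """OLS for one VAR equation: regress y[:, target] on a constant and
    all variables at the given lags. Rows are aligned on max_lag so that
    equations with different lag sets use the same sample."""
    T, k = y.shape
    X = np.column_stack(
        [np.ones(T - max_lag)] + [y[max_lag - l : T - l, :] for l in lags]
    )
    z = y[max_lag:, target]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    return beta, resid

# Simulated stand-in data (assumption): 6 variables, 117 observations.
rng = np.random.default_rng(0)
T, k = 117, 6
y = np.zeros((T, k))
for t in range(2, T):
    y[t] = 0.4 * y[t - 1] + rng.standard_normal(k)
    y[t, 4] += 0.4 * y[t - 2, 4]  # hypothetical: only "E" (index 4) depends on lag 2

# Equation E gets lags 1 and 2 of all variables; the others get lag 1 only.
max_lag = 2
fits = {}
for i in range(k):
    lags = [1, 2] if i == 4 else [1]
    fits[i] = fit_equation(y, i, lags, max_lag)

for i, (beta, resid) in fits.items():
    print(f"equation {i}: {beta.size} coefficients, {resid.size} residuals")
```

Equation-by-equation OLS is used here because it is the simplest option; when the regressors differ across equations, a system estimator such as feasible GLS may be more efficient.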

Regarding the picture, I can understand why you have one full row in the lag 2 matrix, but why do you also have one full column? Based on what you have described, that seems unnecessary.
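To make the row-versus-column distinction concrete, here is the restricted model written out, assuming the six variables of the question and using the notation $a^{(2)}_{E,j}$ (my own, not from the picture) for the lag 2 coefficient of variable $j$ in the equation for E:

$$
y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + u_t, \qquad
A_2 = \begin{pmatrix}
0 & \cdots & 0 \\
\vdots & & \vdots \\
a^{(2)}_{E,1} & \cdots & a^{(2)}_{E,6} \\
\vdots & & \vdots \\
0 & \cdots & 0
\end{pmatrix}
$$

A full row for E means that the equation for E uses the second lag of every variable. A full column for E would instead put the second lag of E into every other equation, which is not what residual autocorrelation in the E equation alone suggests.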

Regarding lag 5, is it plausible that there could be an effect with lag 5? (This is a subject-matter question.) If yes, then consider including just lag 5; including all the lags in between 1 and 5 would not be a parsimonious solution. And you should care about parsimony since your sample is quite small. If lag 5 is quite implausible, maybe the significant autocorrelation at that lag is a false positive that is due to chance?
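Some rough arithmetic may help with both points. The sketch below counts parameters per equation for the two lag choices and estimates how likely at least one spurious "significant" autocorrelation is. The number of lags inspected per residual series (10) and the independence of the checks are assumptions for illustration only.

```python
k = 6  # variables in the VAR (from the question)

def params_per_equation(lags, k, n_deterministic=1):
    """Coefficients in one equation: deterministic terms plus k per included lag."""
    return n_deterministic + k * len(lags)

sparse = params_per_equation([1, 5], k)         # lags 1 and 5 only
full = params_per_equation([1, 2, 3, 4, 5], k)  # all lags from 1 to 5
print(sparse, full)  # 13 31

# Chance of at least one false positive at the 5% level, treating the
# checks as independent (a simplification).
lags_inspected = 10  # hypothetical: ACF lags looked at per residual series
n_checks = k * lags_inspected
p_any = 1 - 0.95 ** n_checks
print(round(p_any, 2))  # roughly 0.95
```

With about 112 usable observations after losing five to the lags, 31 coefficients per equation is a heavy load, and under these assumptions an isolated significant spike somewhere in the ACFs is close to expected even for well-behaved residuals.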

Keep in mind that trying to fit the data very well may lead to overfitting. Using information criteria such as AIC or BIC could help decide between a few sensible candidate models. That means that you would deliberately accept ill-behaved model errors when including extra parameters is too costly due to increased estimation uncertainty. That should give some overall guidance as well as address the questions in the last paragraph.
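One possible way to run such a comparison for a single equation is sketched below. The data are simulated, the candidate lag sets and the helper `equation_ic` are hypothetical, and AIC and BIC are computed from the Gaussian log-likelihood up to an additive constant. All candidates are fitted on the same sample (aligned on the largest lag) so that the criteria are comparable.

```python
import numpy as np

def equation_ic(y, target, lags, max_lag):
    """AIC and BIC (up to a constant) for one equation fitted by OLS
    on a constant and all variables at the given lags."""
    T, k = y.shape
    X = np.column_stack(
        [np.ones(T - max_lag)] + [y[max_lag - l : T - l, :] for l in lags]
    )
    z = y[max_lag:, target]
    beta, *_ = np.linalg.lstsq(X, z, rcond=None)
    resid = z - X @ beta
    n, p = X.shape
    sigma2 = resid @ resid / n  # ML estimate of the error variance
    aic = n * np.log(sigma2) + 2 * p
    bic = n * np.log(sigma2) + p * np.log(n)
    return aic, bic

# Simulated stand-in data (assumption): 6 variables, 117 observations.
rng = np.random.default_rng(1)
T, k = 117, 6
y = np.zeros((T, k))
for t in range(2, T):
    y[t] = 0.4 * y[t - 1] + rng.standard_normal(k)
    y[t, 4] += 0.4 * y[t - 2, 4]  # hypothetical lag 2 effect in "E" (index 4)

# A few sensible candidates for the equation of variable E.
candidates = {
    "lag 1": [1],
    "lags 1-2": [1, 2],
    "lags 1 and 5": [1, 5],
    "lags 1-5": [1, 2, 3, 4, 5],
}
max_lag = 5
results = {name: equation_ic(y, 4, lags, max_lag) for name, lags in candidates.items()}

for name, (aic, bic) in results.items():
    print(f"{name:13s} AIC={aic:8.2f} BIC={bic:8.2f}")

best = min(results, key=lambda name: results[name][1])
print("lowest BIC:", best)
```

BIC penalizes each extra coefficient more heavily than AIC at this sample size, so it tends to favor the sparser candidates; which criterion to trust is a judgment call.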
