Time-Series ARIMA – How to Assess Normality of Residuals for Impulse Response Function Analysis

arima, impulse response, normality-assumption, residuals, time series

I know the issue of normality of residuals has been discussed here quite a lot, and I've learned that it is a less important hypothesis to test in some cases and more critical in others (see for example here and especially here – discussion of the ARIMA case in the comments to the OP). However, none of the posts I read dealt specifically with what normality of residuals means for the impulse response function (IRF) from ARIMA-type models. So I'm still confused about the implications of non-normality (rejecting the Jarque-Bera H0) in this case.

  1. ARIMA/VAR models assume the residuals are white noise, meaning zero mean, finite variance and no serial correlation. This can be checked with a Portmanteau test, for example. But do the residuals also have to be Gaussian? There is a similar question here, but the answer doesn't address the normality part.
  2. AFAIK, if the white noise is also Gaussian, the residuals are i.i.d. I've read somewhere that i.i.d. residuals are required for IRFs in the case of VAR. As VAR is very much based on ARIMA, I'm inclined to think this is true for univariate ARIMA too. But since using IRFs with ARIMA isn't so common, I couldn't find a definitive answer.
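To make the distinction in the two points above concrete, here is a minimal sketch (standard-library Python, not the R workflow used in the question) that computes a Ljung-Box Portmanteau statistic and a Jarque-Bera statistic by hand. The simulated `resid` series is an assumption for illustration only: centred exponential draws, which are i.i.d. white noise but clearly not Gaussian. The function names are hypothetical helpers, not part of any library.

```python
import math
import random

def autocorr(x, k):
    """Sample autocorrelation of x at lag k."""
    n = len(x)
    m = sum(x) / n
    denom = sum((v - m) ** 2 for v in x)
    num = sum((x[t] - m) * (x[t - k] - m) for t in range(k, n))
    return num / denom

def ljung_box(x, h):
    """Ljung-Box Q statistic using lags 1..h.

    Under the null of no serial correlation Q is approximately chi-squared;
    for residuals of a fitted ARMA(p, q) the usual degrees of freedom
    are h - p - q.
    """
    n = len(x)
    return n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, h + 1))

def jarque_bera(x):
    """Jarque-Bera statistic and its asymptotic p-value.

    The chi-squared(2) survival function is exp(-x / 2), so no
    statistics library is needed for the p-value.
    """
    n = len(x)
    m = sum(x) / n
    m2 = sum((v - m) ** 2 for v in x) / n
    m3 = sum((v - m) ** 3 for v in x) / n
    m4 = sum((v - m) ** 4 for v in x) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return jb, math.exp(-jb / 2.0)

# Illustration only: i.i.d. but right-skewed "residuals" (assumed data).
rng = random.Random(0)
resid = [rng.expovariate(1.0) - 1.0 for _ in range(335)]

print("Ljung-Box Q (10 lags):", ljung_box(resid, 10))
print("Jarque-Bera (stat, p):", jarque_bera(resid))
```

A series like this can pass a serial-correlation check while failing a normality test, so the two properties are tested separately.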

It would be great if anyone here could clarify this theoretically.

Practically, the reason I'm asking is that I'm banging my head trying to fit ARIMA models to a bunch of time series in order to find their underlying dynamics using IRFs. I've met all the other assumptions and the fit was reasonable, but I couldn't get the residuals to be normal, even with differencing/power transformations of the data and after accounting for level shifts etc. using tsoutliers::tso() in R. So I don't know whether I should trust any IRF results.

The data I have is a daily time series (N = 335) of news media coverage of different topics. I assume the reason for the non-normality is that the data can have lots of zeros (no coverage of a topic on a given day) and also a small number of ones (meaning the topic was salient on that day), which makes for a heavy right tail. Treating the ones as outliers or replacing the zeros with some other value (a moving average, for instance) would miss the whole point.

So what I'd really like to clarify at this point is whether ARIMA and IRF can be used without normality, and if not – what else should I try? In the latter case, I probably should post another question with some data so that "what else should I try" could be answered properly, but any general pointers would be appreciated as well.

Best Answer

As @RichardHardy points out, normality of the errors is not necessary for estimated IRFs. The IRFs are functions of the estimate of the variance-covariance matrix of the residuals and of the VAR slope coefficients, both of which can be shown to follow asymptotic normal distributions under weaker conditions than normality (e.g., finite fourth moments and symmetric error distributions).

Hence, by the delta method, one may establish that the IRF estimates also are asymptotically normal.
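As a rough illustration of the argument above, the sketch below computes the IRF of a univariate ARMA model from its coefficients (the psi-weights of the MA(∞) representation) and, for the special case of an AR(1) without intercept, a delta-method standard error. The function names are hypothetical. The AR(1) formula assumes a stationary model with i.i.d. errors of finite variance, so that the OLS slope has asymptotic variance (1 - phi^2)/n; normality is not used anywhere.

```python
import math

def arma_irf(phi, theta, horizon):
    """Impulse responses psi_0..psi_horizon of
    y_t = sum_i phi[i-1] * y_{t-i} + e_t + sum_j theta[j-1] * e_{t-j},
    via the recursion psi_j = theta_j + sum_i phi_i * psi_{j-i}, psi_0 = 1.
    """
    psi = [1.0]
    for j in range(1, horizon + 1):
        val = theta[j - 1] if j <= len(theta) else 0.0
        for i in range(1, min(j, len(phi)) + 1):
            val += phi[i - 1] * psi[j - i]
        psi.append(val)
    return psi

def ar1_irf_delta_se(phi_hat, n, horizon):
    """Delta-method standard errors for the AR(1) IRF psi_h = phi^h.

    Assumption: sqrt(n) * (phi_hat - phi) is asymptotically N(0, 1 - phi^2).
    The derivative of phi^h with respect to phi is h * phi^(h-1).
    """
    se_phi = math.sqrt((1.0 - phi_hat ** 2) / n)
    se = []
    for h in range(horizon + 1):
        grad = 0.0 if h == 0 else h * phi_hat ** (h - 1)
        se.append(abs(grad) * se_phi)
    return se

# Illustrative values only (not estimates from the question's data).
print(arma_irf([0.5], [0.4], 5))
print(ar1_irf_delta_se(0.5, 335, 5))
```

The point of the second function is that the IRF is a smooth function of the estimated coefficients, so its sampling uncertainty follows from theirs.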

That said, if the errors are normal, the asymptotic variance-covariance matrix of the impulse response estimator simplifies. Depending on the software you use, it may be the case that this simplification is assumed, so that it would also be necessary that normality is satisfied for inference (e.g., confidence bands) to be asymptotically valid.
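If it is unclear whether a given package's confidence bands rely on the normality simplification, one common alternative is a residual bootstrap, which resamples the fitted residuals instead of assuming a distribution for them. This is a different technique from the asymptotic formulas discussed in this answer, and the sketch below is only a minimal illustration for an AR(1) without intercept fitted by OLS. All function names, the simulated series and the tuning values are assumptions for illustration.

```python
import random

def fit_ar1(y):
    """OLS slope of y_t on y_{t-1} (no intercept; y assumed demeaned)."""
    num = sum(y[t] * y[t - 1] for t in range(1, len(y)))
    den = sum(y[t - 1] ** 2 for t in range(1, len(y)))
    phi = num / den
    resid = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
    return phi, resid

def bootstrap_irf_bands(y, horizon, n_boot=500, alpha=0.05, seed=0):
    """Percentile bands for the AR(1) IRF phi^h from a residual bootstrap."""
    rng = random.Random(seed)
    phi_hat, resid = fit_ar1(y)
    mean_r = sum(resid) / len(resid)
    centred = [r - mean_r for r in resid]
    draws = []
    for _b in range(n_boot):
        # Rebuild a series from the fitted model and resampled residuals.
        y_star = [y[0]]
        for _t in range(1, len(y)):
            y_star.append(phi_hat * y_star[-1] + rng.choice(centred))
        phi_star, _unused = fit_ar1(y_star)
        draws.append([phi_star ** h for h in range(horizon + 1)])
    lo_idx = int((alpha / 2.0) * (n_boot - 1))
    hi_idx = int((1.0 - alpha / 2.0) * (n_boot - 1))
    lower, upper = [], []
    for h in range(horizon + 1):
        col = sorted(d[h] for d in draws)
        lower.append(col[lo_idx])
        upper.append(col[hi_idx])
    point = [phi_hat ** h for h in range(horizon + 1)]
    return point, lower, upper

# Illustration only: AR(1) with skewed, non-Gaussian errors (assumed data).
sim = random.Random(1)
y = [0.0]
for _ in range(334):
    y.append(0.6 * y[-1] + sim.expovariate(1.0) - 1.0)

point, lower, upper = bootstrap_irf_bands(y, horizon=5, n_boot=200)
for h in range(6):
    print(h, point[h], lower[h], upper[h])
```

For real data the same idea would need to be adapted to the actual ARIMA specification, and it still relies on the residuals being approximately i.i.d.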

Detailed discussion of these issues is for example provided in Lütkepohl, New Introduction To Multiple Time Series Analysis. See in particular Proposition 3.6 and the subsequent Remark 4.
