Solved – How to fit ARMA(4,4) model, but some coefficients set to zero / fixed

armar

Suppose I have the following ACF and PACF:
[Figure: ACF and PACF plots of the series]
I want to fit an ARMA-GARCH process. Currently I want to do the first step: specify the mean equation. The first model just uses a constant $\mu$, so no ARMA. For the second model I was thinking about a modified ARMA(1,1) or ARMA(4,4); I don't know what this is called. I want to use only the 4th lag in the AR and MA parts, so this is basically an ARMA(4,4) where the coefficients of the first three lags are set to zero.
$r_t=\delta + \epsilon_t + \alpha_4 r_{t-4}+ b_4 \epsilon_{t-4}$

How can I fit this model in R?

I tried

   arima(logloss, order=c(4,0,4),fixed=c(0,0,0,NA,0,0,0,NA,NA))

First of all: Is this correct?

Second: Does this make sense?

My output is the following:

[Image: arima() output for the restricted ARMA(4,4) fit]

If I calculate the p-values via

    # p-values
    (1-pnorm(abs(aa$coef)/sqrt(diag(aa$var.coef))))/2

I get

    > (1-pnorm(abs(aa$coef)/sqrt(diag(aa$var.coef))))/2
             ar1          ar2          ar3          ar4          ma1          ma2 
    2.500000e-01 2.500000e-01 2.500000e-01 4.431378e-08 2.500000e-01 2.500000e-01 
             ma3          ma4    intercept 
    2.500000e-01 2.523225e-06 1.886732e-01 

So can I say that both coefficients at the 4th lag are highly significant, but the intercept is not? Should I then also fix the intercept to zero?
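A note on the formula above: a two-sided p-value from a z-statistic is usually computed as twice the upper tail, not half of it, and fixed coefficients have no standard error, so the 0.25 entries are just `(1 - pnorm(0))/2`. The following is a minimal sketch of the usual calculation restricted to the estimated coefficients. It assumes a simulated series `x` as a stand-in for `logloss` (the real data is not available here), and the names `fit`, `free`, `est`, `se` and `p_two_sided` are only illustrative.

```r
# Simulated placeholder series (assumption: NOT the asker's logloss data)
set.seed(1)
x <- arima.sim(model = list(ar = c(0, 0, 0, 0.5), ma = c(0, 0, 0, 0.3)), n = 500)

# Same restricted specification as in the question
fit <- arima(x, order = c(4, 0, 4),
             fixed = c(0, 0, 0, NA, 0, 0, 0, NA, NA),
             transform.pars = FALSE)

# var.coef only covers the estimated (non-fixed) coefficients,
# so pick the matching entries of coef by name
free <- rownames(fit$var.coef)
est  <- fit$coef[free]
se   <- sqrt(diag(fit$var.coef))

# Two-sided p-values from the normal approximation
z <- est / se
p_two_sided <- 2 * pnorm(-abs(z))
print(round(cbind(estimate = est, se = se, z = z, p = p_two_sided), 4))
```

Selecting by name avoids relying on vector recycling between the nine coefficients and the three standard errors.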

If I just fit a model with a mean, so no AR or MA, I get:
[Image: arima() output for the mean-only model]

So the mean is also not significant. What should I do? Fit a GARCH without a mean equation? So no mean, no AR or MA part?

EDIT: I played around with it and found that an ARIMA(5,0,5) with the first 3 lags fixed to zero and the mean fixed to zero seems to be appropriate. The output is:
[Image: arima() output for the restricted ARIMA(5,0,5) fit]
The AIC is smaller than in case of the ARIMA(4,0,4) with mean fixed to zero and the residuals look ok. Are my model building steps correct?

Best Answer

(1) Have you correctly fitted an ARIMA with some coefficients forced to zero? - Yes. But did you heed the warning message (the "Warnmeldung" in your output)? If you didn't, & want to make sure your model is causal & invertible, check the roots of the AR & MA polynomials (abs(polyroot(your.polynomial)) is convenient in R): all of them should lie outside the unit circle, i.e. have modulus greater than 1.
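As a rough illustration of the root check, here is a minimal sketch. It assumes a simulated series `x` in place of the asker's data, and the names `fit`, `ar_roots`, `ma_roots`, `causal` and `invertible` are hypothetical. It uses R's sign convention for arima(), in which the AR polynomial is 1 - a1 z - ... - ap z^p and the MA polynomial is 1 + b1 z + ... + bq z^q.

```r
# Simulated placeholder series (assumption: NOT the asker's logloss data)
set.seed(1)
x <- arima.sim(model = list(ar = c(0, 0, 0, 0.5), ma = c(0, 0, 0, 0.3)), n = 500)

# Restricted ARMA(4,4): only lag 4 and the intercept are estimated.
# transform.pars = FALSE is what arima() switches to (with a warning)
# whenever AR coefficients are fixed, so it is set explicitly here.
fit <- arima(x, order = c(4, 0, 4),
             fixed = c(0, 0, 0, NA, 0, 0, 0, NA, NA),
             transform.pars = FALSE)

ar <- fit$coef[1:4]   # ar1..ar4
ma <- fit$coef[5:8]   # ma1..ma4

# Moduli of the roots of the AR and MA polynomials
ar_roots <- abs(polyroot(c(1, -ar)))
ma_roots <- abs(polyroot(c(1, ma)))

causal     <- all(ar_roots > 1)   # all AR roots outside the unit circle
invertible <- all(ma_roots > 1)   # all MA roots outside the unit circle
print(c(causal = causal, invertible = invertible))
```

For this particular restricted model the check reduces to |a4| < 1 and |b4| < 1, but the polyroot form carries over unchanged to specifications with more free lags.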

(2) Is your model-building approach sensible? - I don't think it's the most sensible. Setting various coefficients to zero just because they're "not significant" is no more principled in ARIMA than in multiple regression - see the 'model selection' & 'variable selection' tags - & can lead to the same sorts of problems. The estimate for one parameter depends on all the others that are in the model, so by removing a whole bunch at once you can badly degrade the model. Stepwise selection is a little better, but after doing lots of tests you can't really justify the result overall.

Approaches vary, & depend on the goals of modelling, but I would suggest confining your attention to a smallish set of plausible models (is it really likely that an observation's affected by what happened four & five observations ago but by nothing in the intervening time?), picking a model using AIC, & validating it by out-of-sample forecasts.
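The workflow suggested above might look roughly like the following sketch. Everything in it is an assumption made for illustration: the simulated series `x`, the particular candidate set, the 450/50 split, and the names `candidates`, `fits`, `aic` and `rmse`. It is not a recommendation of these specific models.

```r
# Simulated placeholder series (assumption: NOT the asker's logloss data)
set.seed(1)
x <- arima.sim(model = list(ar = 0.5, ma = 0.3), n = 500)

# Hold out the last 50 observations for out-of-sample validation
train <- window(x, end = 450)
test  <- window(x, start = 451)

# A smallish set of plausible mean models (illustrative choice)
candidates <- list("mean only" = c(0, 0, 0),
                   "AR(1)"     = c(1, 0, 0),
                   "MA(1)"     = c(0, 0, 1),
                   "ARMA(1,1)" = c(1, 0, 1))

fits <- lapply(candidates, function(ord) arima(train, order = ord))

# Pick by AIC on the training data
aic  <- sapply(fits, AIC)
best <- names(which.min(aic))

# Validate: root mean squared error of forecasts over the hold-out period
rmse <- sapply(fits, function(f) {
  fc <- predict(f, n.ahead = length(test))$pred
  sqrt(mean((test - fc)^2))
})

print(round(cbind(AIC = aic, RMSE = rmse), 3))
print(best)
```

The forecasts here are multi-step from a single origin, which is the simplest possible check; a rolling one-step-ahead evaluation would be a stricter test of the chosen model.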