ARIMAX vs Regression – Comparing ARIMAX and Regression with ARIMA Errors

r, time series

I'm struggling to fully understand this topic. I've read the blog post that is highly cited on previous Cross Validated topics: https://robjhyndman.com/hyndsight/arimax/, but I still have some questions. I would appreciate any wisdom.

1) Am I understanding "Regression With ARIMA Errors" correctly? This procedure fits the coefficients for the exogenous variables in a simple linear regression for the time series y, and then constructs an ARIMA model for the residuals of this model.

2) It seems to me that the coefficients for the exogenous variables would be poorly estimated in this approach because the variance in y that is explained by the ARIMA components is unaccounted for. Am I right about this? Why is this considered an acceptable approach?

3) If studying the effect of the exogenous variables is the primary goal of my analysis, does that mean I need a real ARIMAX model instead of Regression with ARIMA Errors?

Many thanks.

Best Answer

Assuming you are fitting the regression with ARIMA errors using arima(), Arima() or auto.arima(), the estimation is done in one step, not two as you describe. That is, the regression coefficients are estimated simultaneously with the ARMA coefficients.
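A minimal sketch of the one-step fit, using base R's arima() with its xreg argument. The simulated data, the variable names (x, eta, y, fit) and the parameter values (regression coefficient 2, AR coefficient 0.7) are illustrative assumptions, not part of the question.

```r
# Simulated data; every name and value below is an illustrative assumption.
set.seed(1)
n   <- 200
x   <- rnorm(n)                                   # exogenous regressor
eta <- arima.sim(model = list(ar = 0.7), n = n)   # AR(1) error series
y   <- 2 * x + eta                                # true regression coefficient is 2

# One call, one likelihood: the coefficient on x and the AR(1) coefficient
# are estimated jointly rather than in two separate stages.
fit <- arima(y, order = c(1, 0, 0), xreg = cbind(x = x))

coef(fit)   # named vector with entries ar1, intercept and x
```

The entry named x in coef(fit) is the regression coefficient, and ar1 describes the error process; both come out of the same fit.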

If you are studying the effect of the exogenous variables, you are much better off using a regression with ARIMA errors than an ARIMAX model. In the ARIMAX model, the effect of the exogenous variables tends to get muddled up with the effect of the autoregressive parts of the model, as I explain in my blog post. On the other hand, the regression with ARIMA errors allows the regression coefficients to be interpreted in the usual way.
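To make the difference concrete, here is the simplest case written out, with one regressor and a single AR(1) term. The notation ($\beta$, $\phi$, $\eta_t$, $\varepsilon_t$, backshift operator $B$) is an assumption chosen for illustration.

$$
\begin{aligned}
\text{Regression with AR(1) errors:}\quad & y_t = \beta x_t + \eta_t, \qquad \eta_t = \phi\,\eta_{t-1} + \varepsilon_t \\
\text{ARIMAX with one AR term:}\quad & y_t = \beta x_t + \phi\,y_{t-1} + \varepsilon_t
\;\;\Longleftrightarrow\;\;
y_t = \frac{\beta}{1 - \phi B}\,x_t + \frac{1}{1 - \phi B}\,\varepsilon_t
\end{aligned}
$$

In the first form, $\beta$ is the change in $y_t$ for a unit change in $x_t$, as in ordinary regression. In the second, $\beta$ is the effect of $x_t$ conditional on $y_{t-1}$, so it cannot be read separately from $\phi$.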
