Solved – Time series with autoregressive distributed lags: forecasting future values

ardl, arima, forecasting, r, time series

I have daily data for the last 2 years.

I want to fit an ARIMAX model in which the regressor component consists of autoregressive distributed lags of the same variable, since its past values have an impact, together with dummy variables to account for seasonality. All of these would go into the xreg parameter of the auto.arima function.

The challenge I am facing is obtaining future values of my predictors. For example, I used 2 years of daily data to build the model. To forecast into the future I also need the values of the lag variables, which I do not know. If I use 2 lags of the daily data in the model, then to predict $Value$ at time $t$ I need $Value$ at $t-1$ and $t-2$, which I have from past records. However, if I want the value at $t+5$, I need the values at $t+3$ and $t+4$, which have not been observed yet. I am not sure how to proceed. As stated earlier, I am using the auto.arima function from the forecast package in R.

My ultimate goal is to predict the next 365 days. The solution I have in mind is to first predict $t+1$, which requires $t$ and $t-1$ as lag components, and I already have both. Once that is done, I can use the predicted value at $t+1$ to predict $t+2$, since I will know $t+1$ from the previous iteration and $t$ from the original data. Is this the right approach?

Best Answer

Do you intend to model $x_t$ as an ARIMAX process where the exogenous regressors are distributed lags of $x_t$? That sounds peculiar. Why not stick to either a pure ARIMA model or a pure distributed lag model for $x_t$?

Yes, iterative prediction which you suggest in the last paragraph seems a reasonable solution given your setting. It is quite commonly used in ARIMA and VAR type of models, for example.
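The iterative scheme can be sketched in base R as follows. This is only an illustration under stated assumptions: the series is simulated, the model is a plain two-lag regression fitted with lm rather than anything produced by auto.arima, and the names y, d, path and fc are made up for the example.

```r
# Simulated stand-in for two years of daily data (an assumption; use your own series).
set.seed(1)
n <- 730
y <- as.numeric(arima.sim(model = list(ar = c(0.5, 0.2)), n = n)) + 10

# Lagged design: y_t regressed on y_{t-1} and y_{t-2}.
d <- data.frame(
  y    = y[3:n],
  lag1 = y[2:(n - 1)],
  lag2 = y[1:(n - 2)]
)
fit <- lm(y ~ lag1 + lag2, data = d)

# Iterate: every new forecast is appended and serves as a lag for the next step.
h <- 365
path <- y
for (i in seq_len(h)) {
  m <- length(path)
  newdata <- data.frame(lag1 = path[m], lag2 = path[m - 1])
  path <- c(path, unname(predict(fit, newdata = newdata)))
}
fc <- tail(path, h)  # the 365 iterated forecasts
```

The first forecast uses only observed values; from the third step onward both lags are themselves forecasts, which is where the uncertainty starts to accumulate.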

Note that predicting 365 days ahead using iterative forecasting with ARIMA type of models may be quite disastrous; forecast errors will compound and get way out of hand. While the first few forecasts may be fine, do not expect high forecast accuracy beyond that; actually, some naive forecast (sample mean, last observed value or the like) will likely do better than iterated ARIMA for distant forecast horizons.
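A small experiment can make the long-horizon behaviour concrete. The sketch below assumes a simulated stationary AR(1) series and uses arima and predict from base R (stats); the horizons shown are arbitrary. For a stationary model like this one the point forecast decays towards the estimated mean while the standard error rises and then levels off; with differencing in the model the standard error would instead keep growing.

```r
# Simulated stationary AR(1) series with mean 50 (an assumption for illustration).
set.seed(42)
y <- arima.sim(model = list(ar = 0.7), n = 730) + 50

fit <- arima(y, order = c(1, 0, 0))
p <- predict(fit, n.ahead = 365)

# With include.mean = TRUE (the default), the "intercept" coefficient is the series mean.
mu <- unname(coef(fit)["intercept"])

# Distance of the point forecast from the estimated mean, and its standard error,
# at a few horizons.
horizons <- c(1, 7, 30, 365)
data.frame(
  horizon        = horizons,
  dist_from_mean = abs(as.numeric(p$pred)[horizons] - mu),
  se             = as.numeric(p$se)[horizons]
)
```

At distant horizons the iterated forecast is practically the sample mean, so it cannot be expected to beat that naive benchmark.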

Note also that functions arima or auto.arima with exogenous regressors implement regression with ARMA errors rather than the ARIMAX model; see more here.
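To spell out the distinction: regression with ARMA errors is $y_t = \beta' x_t + \eta_t$ with $\phi(B)\eta_t = \theta(B)\varepsilon_t$, whereas an ARIMAX model is usually written $\phi(B) y_t = \beta' x_t + \theta(B)\varepsilon_t$. In the first form the ARMA part already captures the dependence on past values, so only regressors whose future values are known, such as seasonal dummies, need to be supplied. A minimal sketch with base R's arima follows; the simulated data, the day-of-week pattern and the names dow, X and season are assumptions for illustration.

```r
set.seed(7)
n <- 730   # observed days
h <- 365   # forecast horizon

# Day-of-week index over history and future: deterministic, hence known in advance.
dow <- factor((seq_len(n + h) - 1) %% 7)
X <- model.matrix(~ dow)[, -1]   # six dummy columns; intercept column dropped

# Simulated series: level + weekly pattern + AR(1) noise (assumption).
season <- c(0, 1, 2, 1, 0, -2, -3)[as.integer(dow)]
y <- 20 + season[1:n] + as.numeric(arima.sim(model = list(ar = 0.6), n = n))

# Regression on the dummies with AR(1) errors.
fit <- arima(y, order = c(1, 0, 0), xreg = X[1:n, ])

# Future dummy rows are passed through newxreg; no lagged y values are needed.
fc <- predict(fit, n.ahead = h, newxreg = X[(n + 1):(n + h), ])
```

With the forecast package the same pattern should apply through auto.arima(y, xreg = ...) followed by forecast(fit, xreg = ...) with the future dummy matrix.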

(I had some trouble reading your text, so let me know if I misunderstood something.)