Solved – Relation between linear regression and ARMA models

arma, linear model, time series

I have data from a time series which I am currently fitting with a linear model. For that I'm treating the data as cross-sectional, where each response corresponds to the value of each variable at the next time point.

Parameter estimates for different sized models through OLS (and regularized variants) are pretty good. The problem is that I'm neglecting the fact that I'm using time series data and thus have correlated errors and nonstationary variables. This shows up in a residual plot, where the errors follow patterns, which makes the calculated residual variance and the standard errors of the estimates very untrustworthy.

So I was looking for a way to incorporate correlated residuals and possibly the problem of moving averages of the variables into the model.

I'm not very familiar with ARMA models and their variants. Is there a standard procedure to stay with the linear model and incorporate ARMA to find the coefficients by means of OLS?

Best Answer

Regression with ARMA errors may be an option (not necessarily the option). It allows you to roughly maintain the interpretation of the usual regression model but also takes care of time series patterns in the model residuals.

The model is as follows:

$$ \begin{aligned} y_t &= \beta' X_t + u_t \\ u_t &= \varphi_1 u_{t-1} + \dotsc + \varphi_p u_{t-p} + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \dotsc + \theta_q \varepsilon_{t-q} \end{aligned} $$

with $y_t$ being the dependent variable, $X_t$ containing the regressors and $u_t$ being an error term with an ARMA pattern. The model can be fit in R using the function arima and including exogenous regressors via the option xreg (see Rob J. Hyndman's blog post "The ARIMAX model muddle" for details).
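
As for staying with OLS: in the special case $p = 1$, $q = 0$, subtracting $\varphi_1$ times the lagged regression equation gives $y_t - \varphi_1 y_{t-1} = \beta'(X_t - \varphi_1 X_{t-1}) + \varepsilon_t$, which has uncorrelated errors and can be estimated by OLS. Alternating between estimating $\varphi_1$ from the residuals and re-running OLS on the transformed data is the Cochrane-Orcutt procedure. The following is only a minimal pure-Python sketch of that special case on simulated data. It is not what R's arima does (which estimates the regression and ARMA parameters jointly), and the function names, the single-regressor setup, and the simulated coefficients are all assumptions made for illustration.

```python
import random

def ols(x, y):
    """OLS of y on an intercept and a single regressor x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

def ar1_coef(u):
    """Estimate phi in u_t = phi * u_{t-1} + eps_t by regressing u_t on u_{t-1}."""
    num = sum(u[t] * u[t - 1] for t in range(1, len(u)))
    den = sum(u[t - 1] ** 2 for t in range(1, len(u)))
    return num / den

def cochrane_orcutt(x, y, n_iter=20):
    """Iterated Cochrane-Orcutt for y_t = a + b * x_t + u_t with AR(1) errors u_t."""
    a, b = ols(x, y)  # start from plain OLS
    phi = 0.0
    for _ in range(n_iter):
        resid = [yi - a - b * xi for xi, yi in zip(x, y)]
        phi = ar1_coef(resid)
        # Quasi-difference both series so the transformed errors are uncorrelated.
        xs = [x[t] - phi * x[t - 1] for t in range(1, len(x))]
        ys = [y[t] - phi * y[t - 1] for t in range(1, len(y))]
        a_star, b = ols(xs, ys)
        a = a_star / (1.0 - phi)  # transformed intercept equals a * (1 - phi)
    return a, b, phi

# Simulated data (assumed values): true slope 2.0, AR(1) error coefficient 0.7.
rng = random.Random(0)
n = 500
x, u = [0.0], [0.0]
for _ in range(n - 1):
    x.append(0.5 * x[-1] + rng.gauss(0, 1))  # stationary regressor
    u.append(0.7 * u[-1] + rng.gauss(0, 1))  # autocorrelated error
y = [1.0 + 2.0 * xi + ui for xi, ui in zip(x, u)]

a_ols, b_ols = ols(x, y)
a_co, b_co, phi_hat = cochrane_orcutt(x, y)
print(f"OLS slope: {b_ols:.3f}")
print(f"Cochrane-Orcutt slope: {b_co:.3f}, estimated phi: {phi_hat:.3f}")
```

In this simulated setting the regressor is exogenous, so plain OLS also recovers the slope reasonably well; what modelling the error structure mainly buys is efficiency and standard errors that can be trusted.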

However, this model suits stationary series only. If $y_t$ or some of the $X_t$s are integrated (e.g. behave like random walks), then you should look for transformations (first differencing) or alternative models (e.g. vector error correction model or autoregressive distributed lag model).
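
To illustrate what first differencing does, here is a small pure-Python sketch on a simulated random walk (the seed, series length, and helper name are assumptions for illustration). The level series has lag-1 sample autocorrelation close to one, while the differenced series behaves like uncorrelated noise.

```python
import random

def lag1_autocorr(z):
    """Sample lag-1 autocorrelation of the series z."""
    n = len(z)
    m = sum(z) / n
    num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in z)
    return num / den

# Simulate a random walk: each value is the previous one plus white noise.
rng = random.Random(1)
walk = [0.0]
for _ in range(999):
    walk.append(walk[-1] + rng.gauss(0, 1))

# First difference: recovers the white-noise increments.
diff = [walk[t] - walk[t - 1] for t in range(1, len(walk))]

r_level = lag1_autocorr(walk)
r_diff = lag1_autocorr(diff)
print(f"lag-1 autocorrelation, levels:      {r_level:.3f}")
print(f"lag-1 autocorrelation, differences: {r_diff:.3f}")
```

A sample autocorrelation that decays very slowly is only an informal hint of integration; a formal unit root test would be the more careful check before deciding to difference.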

Also beware of conditional heteroskedasticity and other types of time series dependence.
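
One informal way to spot conditional heteroskedasticity is to compare the autocorrelation of the residuals with the autocorrelation of their squares. The sketch below does this in pure Python on a simulated ARCH(1) series standing in for model residuals; the parameter values, seed, and helper name are assumptions for illustration. The series itself is serially uncorrelated, but its squares are not.

```python
import random

def lag1_autocorr(z):
    """Sample lag-1 autocorrelation of the series z."""
    n = len(z)
    m = sum(z) / n
    num = sum((z[t] - m) * (z[t - 1] - m) for t in range(1, n))
    den = sum((v - m) ** 2 for v in z)
    return num / den

# Simulate ARCH(1): the conditional variance depends on the previous squared shock.
rng = random.Random(2)
omega, alpha = 1.0, 0.4  # assumed illustrative parameters
eps = [0.0]
for _ in range(5000):
    sigma2 = omega + alpha * eps[-1] ** 2
    eps.append(rng.gauss(0, 1) * sigma2 ** 0.5)
eps = eps[1:]  # drop the artificial starting value

acf_raw = lag1_autocorr(eps)
acf_sq = lag1_autocorr([e * e for e in eps])
print(f"lag-1 autocorrelation of residuals:         {acf_raw:.3f}")
print(f"lag-1 autocorrelation of squared residuals: {acf_sq:.3f}")
```

If the squared residuals of a fitted model show clear autocorrelation like this, the ARMA error structure alone does not capture all the dependence, and a GARCH-type model for the variance may be worth considering.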
