Solved – Time Series Forecasting: ARIMA/VARIMA vs Machine Learning/Deep Learning

arima, forecasting, lstm, machine learning, time series

I am working on the development of a time series forecasting model, and I have some doubts about which model I should use to achieve better results.

PREMISE:

  • Multivariate Time Series: my time series is a multivariate one, with different series (features) and a target series.
  • Seasonality: I am pretty sure that there are seasonal patterns and periods that affect the time series behaviour.

Some of my colleagues have suggested the use of statistical models like ARIMA/VARIMA (which I am not familiar with). I know that the basic concept behind these models is to "filter out" the meaningful patterns from the series (trend, seasonality, etc.) in order to obtain a stationary time series (i.e. a series with constant mean/variance, which basically represents noise), thereby capturing the behaviour and patterns of the series that are useful for prediction.
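To make that concrete, here is a minimal sketch (toy data, not your series) of the "make it stationary" step: an Augmented Dickey-Fuller test before and after differencing, which is how the d in ARIMA(p, d, q) is often chosen.

```python
# Minimal sketch of the stationarity check behind ARIMA, on a toy series.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
# Toy series: random walk plus a mild trend, standing in for a real target.
y = pd.Series(np.cumsum(rng.normal(size=200)) + 0.1 * np.arange(200))

# ADF test: null hypothesis = the series has a unit root (non-stationary).
# A small p-value suggests the series is stationary.
pvalue_raw = adfuller(y)[1]
pvalue_diff = adfuller(y.diff().dropna())[1]

print(f"ADF p-value, raw series:        {pvalue_raw:.3f}")
print(f"ADF p-value, first differences: {pvalue_diff:.3f}")
# If one round of differencing is enough, d = 1 in ARIMA(p, d, q).
```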

I, in contrast, am more familiar with deep learning models (e.g. LSTM, RNN) and traditional machine learning models (Random Forest). Some of these models have a sort of memory, others do not.
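For reference, the "memory" in an LSTM comes from feeding it the series as sliding windows; a minimal sketch (toy data, untuned architecture, assumed window length) looks like this:

```python
# Sketch of reshaping a multivariate series into (samples, timesteps, features)
# windows for an LSTM. Shapes and layer sizes are placeholders, not a recipe.
import numpy as np
import tensorflow as tf

n_steps, n_features = 24, 3                 # look-back window and number of series
X_raw = np.random.rand(500, n_features)     # toy multivariate inputs
y_raw = np.random.rand(500)                 # toy target series

# Each sample is the previous n_steps observations; the target is one step ahead.
X = np.stack([X_raw[i:i + n_steps] for i in range(len(X_raw) - n_steps)])
y = y_raw[n_steps:]

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(n_steps, n_features)),
    tf.keras.layers.LSTM(32),               # the recurrent "memory" over the window
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=2, verbose=0)        # illustrative only
```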

QUESTIONS:

Is there some rule of thumb for choosing between these models? For example, knowing that my time series has some well-known patterns like seasonality and trend, why should I use ARIMA instead of a Random Forest model or an LSTM? How can I make sure that I am using ARIMA correctly (i.e. that its underlying assumptions hold)?
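On the last point, a common practical check (sketched below on toy data, with a placeholder order) is to fit the model and verify that the residuals look like white noise, e.g. with a Ljung-Box test:

```python
# Hedged sketch of an ARIMA assumption check: residuals should show no
# remaining autocorrelation. Data and order=(1, 1, 1) are placeholders.
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(1)
y = pd.Series(np.cumsum(rng.normal(size=300)))   # toy non-seasonal series

fit = ARIMA(y, order=(1, 1, 1)).fit()

# Ljung-Box test on the residuals: large p-values mean no significant
# autocorrelation is left, i.e. the model captured the serial structure.
print(acorr_ljungbox(fit.resid, lags=[10], return_df=True))
print(fit.summary())
```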

WHAT I KNOW:

  • Usually, time series datasets are smaller than the big datasets deep learning thrives on, so deep learning models are not so powerful on this kind of data. Some of these models (RNN/LSTM) take the sequentiality of the data into consideration.

  • Classical machine learning models (e.g. Random Forest) don't take the sequentiality of the data into consideration, but they work better on smaller datasets.

  • Classical statistical models are statistically robust, but they rely on certain assumptions about the time series.

  • With a smaller dataset, traditional machine learning models usually perform better when effort is put into feature engineering and model tuning (see the sketch after this list).
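A minimal sketch of that feature-engineering route: turn the series into a tabular dataset of lagged values so that a model with no notion of sequence order can still "see" the recent past. Column names and lag choices below are placeholders.

```python
# Lag-feature engineering for a tree model on a toy series.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
df = pd.DataFrame({"target": np.sin(np.arange(300) / 10) + rng.normal(0, 0.1, 300)})

# Lagged copies of the target become ordinary columns (features).
for lag in (1, 2, 3, 12):
    df[f"lag_{lag}"] = df["target"].shift(lag)
df = df.dropna()

X, y = df.drop(columns="target"), df["target"]
split = int(len(df) * 0.8)               # keep the train/test split chronological
model = RandomForestRegressor(n_estimators=200, random_state=0)
model.fit(X.iloc[:split], y.iloc[:split])
print("holdout R^2:", model.score(X.iloc[split:], y.iloc[split:]))
```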

Best Answer

As a personal rule of thumb, I begin by applying simple statistical models (ARIMA, exponential smoothing) because they require less computation and are generally more interpretable. Moreover, there are automated packages (such as the forecast package) that take care of the task of model selection.
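The forecast package is R's (auto.arima, ets); assuming you are working in Python, a rough analogue is pmdarima's auto_arima, sketched here on toy data:

```python
# Automated order selection, roughly analogous to R's forecast::auto.arima.
import numpy as np
import pmdarima as pm

rng = np.random.default_rng(3)
# Toy series: random walk plus a 12-period seasonal component.
y = np.cumsum(rng.normal(size=200)) + 5 * np.sin(np.arange(200) * 2 * np.pi / 12)

# Searches over (p, d, q)(P, D, Q)m orders by information criterion.
model = pm.auto_arima(y, seasonal=True, m=12, suppress_warnings=True)
print(model.summary())
print(model.predict(n_periods=12))   # 12-step-ahead forecast
```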

Generally, I would expect better predictive performance from advanced machine learning algorithms, especially when there are a lot of external predictors. However, there is no guarantee of that. In any case, you have to consider how much time you want to devote to constructing a model, hyperparameter tuning, etc.
