Solved – When to use ARIMA model vs linear regression

arima, forecasting, regression

I am trying to forecast a time series of product sales. I started with an ARIMA model: I iterated over all combinations of the model parameters (p, d, q) and picked the one with the lowest RMSE. The problem is that the forecast is not as good as I wanted it to be, so I started studying other ways of prediction, such as regression.

After plotting my data in a cumulative plot, I noticed that most of the time series I have are fairly linear, so I can probably fit a linear regression model to them.

What should I use in my case, an ARIMA model or linear regression? And what does an ARIMA model offer that regression does not, to compensate for being more complicated?

Here is a screenshot of my ARIMA forecast, and cumulative plot (weekly):

[image]

Note that 373 is the RMSE of the time series forecast; blue is the prediction and red is the test data.

[image]

This is my data per month; the model does even worse at predicting it.

Best Answer

First of all, you should model what is observed, NOT what is accumulated. Secondly, an ARIMA model can evolve into a time trend model with Intervention Detection, with the potential of detecting breakpoints in trend. Stay well clear of simple OLS models with trend or trend squared unless theory (domain knowledge) tells you so.
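To illustrate the first point, here is a minimal sketch using only the Python standard library. The sales figures are simulated (a constant level plus noise, an assumption for illustration, not the asker's data) and `ols_r2` is a hypothetical helper. It shows how accumulating a series can make it look almost perfectly linear even when the observed values contain no trend at all.

```python
import random

def ols_r2(y):
    """R^2 of a least-squares line y = a + b*t fitted against t = 0..n-1."""
    n = len(y)
    t = range(n)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (yi - y_mean) for ti, yi in zip(t, y))
    syy = sum((yi - y_mean) ** 2 for yi in y)
    # For a one-regressor model, R^2 equals the squared correlation.
    return (sxy ** 2) / (sxx * syy)

random.seed(0)
# Hypothetical weekly sales: constant level plus noise, no trend.
sales = [random.gauss(100, 20) for _ in range(100)]

# Running total of the same series.
cumulative = []
total = 0.0
for s in sales:
    total += s
    cumulative.append(total)

r2_raw = ols_r2(sales)
r2_cum = ols_r2(cumulative)
print(f"R^2 on observed sales:   {r2_raw:.3f}")
print(f"R^2 on cumulative sales: {r2_cum:.3f}")
```

The straight line in the cumulative plot mostly reflects the average sales level. It says little about how well the period-to-period values can be predicted.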

Closely review a piece I wrote a few years back comparing and contrasting ARIMA with regression: https://autobox.com/pdfs/regvsbox-old.pdf

EDITED AFTER RECEIPT OF INTERMITTENT DEMAND DATA:

The data you have (although daily) does not have values for every day, so one can't build a daily model like the one in "Simple method of forecasting number of guests given current and historical data".

Secondly, you don't have data for each and every week of the year, so you can't build a weekly model as is done in these examples: https://stats.stackexchange.com/search?q=user%3A3382+weekly

So all you have left is a monthly model. I propose that you reassemble your data into monthly buckets (totals by month) and repost your data to the web and I will try and help further.
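As a rough sketch of what reassembling into monthly buckets could look like, the standard-library snippet below sums dated records by calendar month. The function name `monthly_totals`, the sample records, and the choice to report months without any record as zero are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date

def monthly_totals(records):
    """Sum (date, quantity) records into calendar-month buckets.

    Months between the first and last observation that have no records
    are reported as 0.0 (an assumption: no record means no sales).
    """
    totals = defaultdict(float)
    for day, qty in records:
        totals[(day.year, day.month)] += qty

    first, last = min(totals), max(totals)
    out = []
    year, month = first
    while (year, month) <= last:
        out.append(((year, month), totals.get((year, month), 0.0)))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return out

# Hypothetical intermittent daily records, not the asker's data.
records = [
    (date(2012, 3, 5), 10.0),
    (date(2012, 3, 19), 4.0),
    (date(2012, 5, 2), 7.0),
    (date(2013, 1, 15), 3.0),
]
series = monthly_totals(records)
for (year, month), total in series:
    print(f"{year}-{month:02d}: {total}")
```

Whether an empty month really means zero sales or missing data is a domain question, and it changes how the resulting series should be modeled.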

EDITED AFTER RECEIPT OF 46 MONTHLY VALUES STARTING AT 2012/3:

You say "how poorly ARIMA model is predicting my monthly data". I say your choice of ARIMA software and approach is performing poorly due to at least 3 Gaussian violations, viz.: 1) there are identifiable pulses in the data; 2) there is an identifiable level/step shift down in the data; 3) there is an identifiable error variance reduction/change in the data. I used AUTOBOX, which I have helped to develop and which has features to deal with data like this.
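To make the idea of a "pulse" concrete, here is a minimal sketch that flags one-off outliers with a robust z-score based on the median and the median absolute deviation (MAD). This is a generic screening heuristic swapped in for illustration, not the intervention detection procedure AUTOBOX uses. The function `flag_pulses`, the threshold of 3.5 and the sample values are assumptions.

```python
from statistics import median

def flag_pulses(y, threshold=3.5):
    """Return indices whose robust z-score exceeds `threshold`.

    Uses the median and the median absolute deviation (MAD), so a few
    extreme values do not inflate the scale estimate the way they would
    inflate a standard deviation.
    """
    med = median(y)
    mad = median(abs(v - med) for v in y)
    if mad == 0:
        return []
    # 0.6745 rescales the MAD to match the standard deviation under normality.
    return [i for i, v in enumerate(y)
            if abs(0.6745 * (v - med) / mad) > threshold]

# Hypothetical monthly totals with two one-off spikes (positions 4 and 9).
y = [52, 48, 50, 51, 120, 49, 47, 53, 50, 5, 51, 49]
print(flag_pulses(y))  # prints [4, 9]
```

A screen like this assumes a stable level. With a level shift in the data, as in point 2, the values on one side of the shift can be mistaken for pulses, which is one reason pulses and level shifts are better identified jointly within the model.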

Your data is here: [image]

A useful ARIMA model is here, (2,0,0)(0,0,0)12: [image] and here: [image]
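For readers unfamiliar with the notation: (2,0,0)(0,0,0)12 denotes two non-seasonal autoregressive terms, no differencing, no moving-average terms, and no seasonal terms at period 12. Leaving aside any intervention terms (the images that showed the fitted model are not available here), the ARIMA part has the form below. The symbols $c$, $\phi_1$, $\phi_2$ and $\varepsilon_t$ are generic notation, not values from the fitted model.

```latex
y_t = c + \phi_1 \, y_{t-1} + \phi_2 \, y_{t-2} + \varepsilon_t
```

Here $y_t$ is the monthly total, $c$ is a constant, and $\varepsilon_t$ is the error term.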

A significant reduction in the model error variance was detected at period 27: [image]
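One simple way to look for a change in error variance is to scan candidate split points in the model residuals and compare the variance before and after each one. The sketch below does only that. It is not the test AUTOBOX applies, it gives no significance level, and the function `best_variance_split`, the `min_segment` parameter and the residuals are hypothetical.

```python
from statistics import pvariance

def best_variance_split(residuals, min_segment=5):
    """Find the split index that maximizes the ratio between the
    variances of the residuals before and after it.

    Returns (index, var_before, var_after), where `index` is the first
    position of the second segment, or None if no split is usable.
    """
    best = None
    for k in range(min_segment, len(residuals) - min_segment + 1):
        before = pvariance(residuals[:k])
        after = pvariance(residuals[k:])
        if before == 0 or after == 0:
            continue
        ratio = max(before / after, after / before)
        if best is None or ratio > best[0]:
            best = (ratio, k, before, after)
    if best is None:
        return None
    return best[1:]

# Hypothetical residuals: noisy first half, quiet second half.
residuals = [10.0, -10.0] * 5 + [1.0, -1.0] * 5
idx, var_before, var_after = best_variance_split(residuals)
print(idx, var_before, var_after)
```

On real residuals the split with the largest ratio still needs a formal test before it is treated as a genuine variance change.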

The Actual/Fit and Forecast graph is here: [image]