The reason is that ARIMA is auto-projective: it uses the most recent data to compute what is essentially a weighted average of past values. When forecasting, the one-step-ahead forecast is used to predict the second step, and so on, which leads to long-term forecasts that approach an asymptote. When fitting, by contrast, the actual history is used to predict each next point. What you should do is build an integrated model that includes deterministic structure: day-of-the-week effects (and changes in the day-of-the-week coefficients), monthly/weekly effects, events like holidays, and any needed level shifts or local time trends that can be detected and seamlessly incorporated. Particular days of the month and particular weeks of the month may also come into play. Do this both by hour and for the daily sums, and use the daily-sum history and its forecasts as a possible predictor variable for each of the 24 hourly models. Verify that the parameters of each of your 25 models are invariant over time and that the error variance of each model doesn't change over time. Finally, you may need a parent-to-child or child-to-parent strategy to reconcile any differences between the hourly and daily forecasts. We have been very successful using these procedures and you should be too.
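The convergence to an asymptote mentioned above falls straight out of the recursion. A minimal sketch in Python using an AR(1) model (the coefficients here are invented for illustration, not estimated from any real data):

```python
# Recursive multi-step forecasting from a fitted AR(1): each forecast is fed
# back in as if it were data, so the forecasts decay toward the long-run mean.
phi, mu = 0.8, 100.0   # hypothetical AR(1) coefficient and long-run mean
y_last = 130.0         # last observed value

forecasts = []
y_hat = y_last
for h in range(1, 25):
    # one-step recursion: previous forecast stands in for the unseen value
    y_hat = mu + phi * (y_hat - mu)
    forecasts.append(y_hat)

# forecasts[0] is still close to the data; forecasts[-1] is nearly at mu
```

The same flattening happens with any ARIMA model once the forecast horizon outruns the memory of the AR and MA terms, which is why deterministic regressors are needed to carry structure into the long-term forecasts.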
- How do I select the best ARIMA model (by trying all the different orders and checking which gives the best MASE/MAPE/MSE? where the choice of performance measure could be a discussion in its own right..)
Out of sample risk estimates are the gold standard for performance evaluation, and therefore for model selection. Ideally, you cross-validate so that your risk estimates are averaged over more data. FPP explains one cross-validation method for time series. See Tashman for a review of other methods:
Tashman, L. J. (2000). Out-of-sample tests of forecasting accuracy: an analysis and review. International Journal of Forecasting, 16(4), 437–450. doi:10.1016/S0169-2070(00)00065-0
Of course, cross-validation is time consuming, so people often resort to in-sample criteria such as AIC to select a model, which is how auto.arima selects the best model. This approach is perfectly valid, if perhaps not optimal.
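To make the out-of-sample idea concrete, here is a minimal rolling-origin evaluation sketched in Python, with a naive one-step forecaster standing in for a refitted ARIMA model; the series values are arbitrary illustration data:

```python
# Rolling-origin (out-of-sample) evaluation: at each origin, fit on the data
# up to that point, forecast one step, and score against the actual value.
series = [112, 118, 132, 129, 121, 135, 148, 148, 136, 119, 104, 118]
min_train = 6          # smallest training window before evaluation starts
errors = []

for origin in range(min_train, len(series)):
    train = series[:origin]
    forecast = train[-1]          # naive forecast: last observed value
    actual = series[origin]
    errors.append(abs(actual - forecast))

mae = sum(errors) / len(errors)   # out-of-sample mean absolute error
```

In practice you would refit the ARIMA model at each origin (or periodically, to save time) instead of using the naive forecast, and average whichever error measure you settled on.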
- If I generate a new model and forecast for every new day (as in online forecasting), do I need to take the yearly trend into account, and how? (On such a small subset my guess would be that the trend is negligible.)
I'm not sure what you mean by yearly trend. Assuming you mean yearly seasonality, there's not really any way to take it into account with less than a year's worth of data.
- Would you expect that the model order stays the same throughout the dataset, i.e. when taking another subset will that give me the same model?
I would expect that barring some change to how the data are generated, the most correct underlying model will be the same throughout the dataset. However, that's not the same as saying that the model selected by any procedure (such as the procedure used by auto.arima) will be the same if that procedure is applied to different subsets of the data. This is because the variability due to sampling will result in variability in the results of the model selection procedure.
- What is a good way, within this method to cope with holidays? Or is ARIMAX with external holiday dummies needed for this?
External holiday dummies are the best approach.
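For concreteness, a holiday dummy is just a 0/1 regressor aligned with the series, which you would then pass as (part of) the external regressor matrix. A sketch in Python, with hypothetical dates:

```python
# Build a 0/1 holiday indicator aligned with a run of daily observations.
# The holiday set here is an assumed placeholder, not an official calendar.
from datetime import date, timedelta

start = date(2023, 12, 20)
days = [start + timedelta(d) for d in range(14)]            # 14 daily periods
holidays = {date(2023, 12, 25), date(2024, 1, 1)}           # assumed holidays

dummy = [1 if d in holidays else 0 for d in days]           # the xreg column
```

With hourly data you would repeat each day's value 24 times (or use separate dummies per hour if the holiday effect differs across the day).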
- Do I need to use the Fourier series approach to try models with
seasonality=672
as discussed in Long seasonal periods?
You need to do something, because as mentioned in that article, the arima function in R does not support seasonal periods greater than 350. I've had reasonable success with the Fourier approach. Other options include forecasting after seasonal decomposition (also covered in FPP), and exponential smoothing models such as bats and tbats.
- If so would this be like
fit <- Arima(timeseries, order=c(0,1,4), xreg=fourier(1:n, 4, 672))
(where the function fourier is as defined in Hyndman's blog post)
That looks correct. You should experiment with different numbers of terms. Note that there is now a fourier function in the forecast package with a slightly different specification that I assume supersedes the one on Hyndman's blog. See the help file for syntax.
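To see what those regressors actually are, here is the idea behind the fourier terms written out by hand in Python (a sketch of the construction, not the exact output format of forecast's fourier function):

```python
# Fourier terms for a long seasonal period: for period m and K harmonics,
# build 2K deterministic covariates, a sin/cos pair per harmonic. The ARIMA
# errors then only need to model what these smooth terms miss.
import math

def fourier_terms(n, K, m):
    """Return n rows of [sin(2*pi*k*t/m), cos(2*pi*k*t/m)] for k = 1..K."""
    rows = []
    for t in range(1, n + 1):
        row = []
        for k in range(1, K + 1):
            row.append(math.sin(2 * math.pi * k * t / m))
            row.append(math.cos(2 * math.pi * k * t / m))
        rows.append(row)
    return rows

X = fourier_terms(n=1344, K=4, m=672)   # two weekly cycles, 4 harmonics
```

With K=4 and m=672 you get 8 columns; increasing K lets the seasonal pattern be less smooth at the cost of more parameters, which is why experimenting with the number of terms matters.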
- Are initial P and Q components included with the fourier series?
I'm not sure what you're asking here. P and Q usually refer to the degrees of the AR and MA seasonal components. Using the fourier approach, there are no seasonal components and instead there are covariates for fourier terms related to season. It's no longer seasonal ARIMA, it's ARIMAX where the covariates approximate the season.
Best Answer
In theory it would be possible to combine Croston's and ARIMA. Croston's splits the time series into two component series (a demand interval series and a demand size series) and then uses exponential smoothing to forecast each series separately before recombining them into a rate of demand (or rate of sales) series. You could blend Croston's and ARIMA by using an ARIMA or ARMA model to forecast one or both of the components instead of exponential smoothing.
In practice it would be difficult, though, since ARIMA requires a lot of data to be effective (otherwise exponential smoothing or a simple moving average works better). Because your data are intermittent, you would need to make up for that with a very long series. Keep in mind that by using Croston's you effectively shorten your series: if you have 1 year of weekly data, i.e. 52 weeks, and a sale on average every two weeks, your series gets shortened to ~26 demand points, which is very short for fitting an ARIMA model. So you would need several years of weekly data, not just 2 or 3. This is probably why you don't see the idea of mixing ARIMA and Croston's in the literature.
This problem is even more pronounced if you want to include external variables in your ARIMA model, because Croston's rescales your demand time series onto what is essentially a variable time scale. I don't see how you could rescale the external variables so that they stay in sync with the component series that Croston's produces.
So yeah, combining Croston's and ARIMA would be very difficult to pull off.
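To make the size/interval decomposition concrete, here is a bare-bones Croston's forecast sketched in Python: demand sizes and inter-demand intervals are each smoothed with simple exponential smoothing, and the forecast demand rate is their ratio (alpha is a typical default, not tuned):

```python
# Minimal Croston's method: smooth the nonzero demand sizes and the gaps
# between them separately, then forecast demand-per-period as size/interval.
def croston(series, alpha=0.1):
    size = None      # smoothed demand size
    interval = None  # smoothed interval between demands
    gap = 1          # periods since the last demand
    for y in series:
        if y > 0:
            if size is None:
                # initialize both components at the first demand
                size, interval = float(y), float(gap)
            else:
                size += alpha * (y - size)          # SES update on sizes
                interval += alpha * (gap - interval)  # SES update on gaps
            gap = 1
        else:
            gap += 1
    return size / interval   # forecast rate of demand per period

rate = croston([0, 3, 0, 0, 2, 0, 4, 0, 0, 0, 5])
```

The blend discussed above would replace one or both of the SES updates with an ARMA forecast of the corresponding component series, which is where the shortened-series problem bites.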
Yes. For intermittent demand forecasting, you can model the demand counts with a GLM using a Poisson or Negative Binomial distribution, which lets you include causal factors as covariates.
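As a sketch of the Poisson GLM idea, here is a tiny Poisson regression with a single causal factor, fit by gradient ascent on the log-likelihood in plain Python; the data and learning rate are invented for illustration (in practice you would use a proper GLM routine such as R's glm):

```python
# Poisson regression (GLM with log link) on intermittent counts, with a
# hypothetical promotion flag as the causal factor.
import math

promo = [0, 1, 0, 0, 1, 1, 0, 1, 0, 0]   # causal factor (made-up data)
y     = [0, 3, 0, 1, 2, 4, 0, 3, 1, 0]   # intermittent demand counts

b0, b1 = 0.0, 0.0
for _ in range(5000):
    g0 = g1 = 0.0
    for x, yi in zip(promo, y):
        mu = math.exp(b0 + b1 * x)        # log link: E[y] = exp(b0 + b1*x)
        g0 += yi - mu                     # Poisson score equations
        g1 += (yi - mu) * x
    b0 += 0.01 * g0                       # small fixed-step gradient ascent
    b1 += 0.01 * g1

rate_promo = math.exp(b0 + b1)   # expected demand on promotion days
rate_base = math.exp(b0)         # expected demand otherwise
```

With a single 0/1 covariate the MLE reduces to the group means, so the fitted rates converge to the average demand on promotion and non-promotion days respectively; swapping in a Negative Binomial likelihood handles overdispersed counts the same way.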