In-sample fits are not a reliable guide to out-of-sample forecasting accuracy. The gold standard in forecasting accuracy measurement is to use a holdout sample. Remove the last 30 days from the training sample, fit your models to the rest of the data, use the fitted models to forecast the holdout sample and simply compare accuracies on the holdout, using Mean Absolute Deviations (MAD) or weighted Mean Absolute Percentage Errors (wMAPEs).
Here is an example using R. I am using the 2000th series of the M3 competition, which is already divided into the training series M3[[2000]]$x and the test data M3[[2000]]$xx. This is monthly data. The last two lines output the wMAPEs of the forecasts from the two models, and we see here that the ARIMA model (wMAPE 18.6%) outperforms the automatically fitted ETS model (32.4%):
library(forecast)
library(Mcomp)

# Series 2000 of the M3 competition: monthly data,
# pre-split into training (x) and test (xx) parts
M3[[2000]]

# Fit an ETS and an ARIMA model to the training data
ets.model <- ets(M3[[2000]]$x)
arima.model <- auto.arima(M3[[2000]]$x)

# Forecast over the required horizon h, keeping the point forecasts
ets.forecast <- forecast(ets.model, h = M3[[2000]]$h)$mean
arima.forecast <- forecast(arima.model, h = M3[[2000]]$h)$mean

# wMAPE on the holdout: total absolute error divided by total actuals
sum(abs(ets.forecast - M3[[2000]]$xx)) / sum(M3[[2000]]$xx)
sum(abs(arima.forecast - M3[[2000]]$xx)) / sum(M3[[2000]]$xx)
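If you compare many models this way, the two sums above are worth wrapping in a small helper. Here is a minimal base-R sketch; the function name wMAPE is my own:

```r
# Weighted MAPE: total absolute error divided by total actuals.
# Unlike the plain MAPE, this does not blow up when individual
# actuals are close to zero.
wMAPE <- function(actual, forecast) {
  sum(abs(forecast - actual)) / sum(actual)
}

# Toy illustration with made-up numbers
wMAPE(actual = c(100, 120, 90), forecast = c(110, 115, 100))
```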
In addition, it looks like there are abnormally high sales near indices 280-300. Could this be Christmas sales? If you know about calendar events like these, it is best to feed them to your forecasting model as explanatory variables, which will give you a better forecast the next time Christmas rolls around. You can do this easily with ARIMA(X) and NNs, but not so easily with ETS.
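As a sketch of the ARIMAX approach: you can encode Christmas as a 0/1 dummy regressor and hand it to auto.arima via xreg. The series and variable names below are made up for illustration, and I assume the series ends in December so the forecast horizon starts in January:

```r
library(forecast)

# Hypothetical monthly sales series spanning 10 full years
set.seed(1)
sales <- ts(rpois(120, lambda = 50), frequency = 12)

# Dummy regressor: 1 in December, 0 otherwise
xmas <- as.numeric(cycle(sales) == 12)

# ARIMAX: auto.arima accepts extra regressors through xreg
fit <- auto.arima(sales, xreg = xmas)

# To forecast 12 months ahead, supply the *future* values of the
# dummy; here the horizon runs January through December
future.xmas <- as.numeric(seq_len(12) == 12)
fc <- forecast(fit, h = 12, xreg = future.xmas)
```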
Finally, I recommend this textbook on forecasting: http://otexts.com/fpp/
I found the answer on Stack Overflow. To summarize: instead of doing
ARIMAfit <- auto.arima(diff(diff(val.ts)), approximation = FALSE, trace = FALSE, xreg = diff(diff(xreg)))
we should instead do
ARIMAfit <- auto.arima(val.ts, d = 2, approximation = FALSE, trace = FALSE, xreg = xreg)
Setting d=2 makes auto.arima handle the differencing internally, so the forecasts come back on the original scale of the data rather than as second differences. So if I do forecast(ARIMAfit, h=300, xreg=testxreg), where testxreg holds the 300 future regressor values on the original (undifferenced) scale, I will get the next 300 forecast values.
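Putting the whole pattern together on simulated data (all names here are stand-ins for your own series and regressors, and the quadratic trend is just a contrived reason for d=2); note that h should match the number of future regressor rows you supply:

```r
library(forecast)

set.seed(42)
n <- 200
xreg <- matrix(rnorm(n), ncol = 1, dimnames = list(NULL, "x"))

# Series with a quadratic trend, so second differencing is plausible
val.ts <- ts(0.01 * (1:n)^2 + 2 * xreg[, 1] + rnorm(n))

# Let auto.arima difference internally instead of pre-differencing
ARIMAfit <- auto.arima(val.ts, d = 2, xreg = xreg)

# Future regressor values must cover the whole forecast horizon
testxreg <- matrix(rnorm(10), ncol = 1, dimnames = list(NULL, "x"))
fc <- forecast(ARIMAfit, h = 10, xreg = testxreg)
```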
Be careful: the presence of outliers will often cause the Box-Cox test to incorrectly suggest a power transformation that is unneeded. The Box-Cox test (see "When (and why) should you take the log of a distribution (of numbers)?") can often be misleading and should only be used when its required assumptions are met, i.e. no pulses, no seasonal pulses, no level shifts, and no trends in the residuals, and of course no deterministic change points in the error variance at particular points in time.
For more on this, please see http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.469.7176&rep=rep1&type=pdf, in particular section 7 on the "Effect of outliers and influential cases".