You should evaluate models and forecasts from different origins across different horizons, not one single number, in order to gauge an approach.
I assume that your data is from the US. I prefer 3+ years of daily data, since two holidays can land on a weekend and leave you with no weekday read. It looks like your Thanksgiving impact is a day off in 2012, or there was a recording error of some kind that caused the model to miss the Thanksgiving day effect.
Januarys are typically low in the dataset, if you look at them as a % of the year; weekends are high. The dummies reflect this behavior: MONTH_EFF01, FIXED_EFF_N10507, FIXED_EFF_N10607.
I have found that using an AR component with daily data assumes that the last two weeks' day-of-the-week pattern is the pattern in general, which is a big assumption. We started with 11 monthly dummies and 6 daily dummies; some dropped out of the model. B**1 means that there is a lag impact the day after a holiday. There were 6 special days of the month (days 2, 3, 5, 21, 29, 30; day 21 might be spurious?), 3 time trends, 2 seasonal pulses (where a day of the week started deviating from the typical pattern: a 0 before that date and a 1 every 7th day after), and 2 outliers (note the Thanksgiving!). This took just under 7 minutes to run. Download all results here: www.autobox.com/se/dd/daily.zip
It includes a quick-and-dirty XLS sheet to check whether the model makes sense. Of course, the XLS percentages are in fact rough, as they are crude benchmarks.
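For readers who want to reproduce the flavor of this approach without Autobox, here is a minimal sketch of estimating day-of-week and month dummy effects by least squares. All data here is simulated and all numbers are hypothetical; this is only the dummy-regression core, without the holiday effects, trends, pulses, or outlier detection of the full model.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3 * 364  # roughly three years of daily data, as recommended above

# Simulated calendar: day-of-week cycles every 7 days, a crude month index.
day_of_week = np.arange(n) % 7
month = (np.arange(n) // 30) % 12

# Design matrix: intercept, 6 day-of-week dummies, 11 month dummies
# (one category of each is the baseline).
X = np.column_stack(
    [np.ones(n)]
    + [(day_of_week == d).astype(float) for d in range(1, 7)]
    + [(month == m).astype(float) for m in range(1, 12)]
)

# Simulate a series that actually follows these dummy effects, plus noise.
true_beta = rng.normal(0, 50, X.shape[1])
true_beta[0] = 1000.0
y = X @ true_beta + rng.normal(0, 5, n)

# Estimate the dummy effects by ordinary least squares.
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With three years of daily data the dummy coefficients are recovered quite precisely; with much shorter histories (or two holidays on a weekend, as noted above) some effects become inestimable.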
Try estimating this model:
Y(T) = .53169E+06
+[X1(T)][(+ .13482E+06B** 1)] M_HALLOWEEN
+[X2(T)][(+ .17378E+06B**-3)] M_JULY4TH
+[X3(T)][(- .11556E+06)] M_MEMORIALDAY
+[X4(T)][(- .16706E+06B**-4+ .13960E+06B**-3- .15636E+06B**-2
- .19886E+06B**-1)] M_NEWYEARS
+[X5(T)][(+ .17023E+06B**-2- .26854E+06B**-1- .14257E+06B** 1)] M_THANKSGIVI
+[X6(T)][(- 71726. )] MONTH_EFF01
+[X7(T)][(+ 55617. )] MONTH_EFF02
+[X8(T)][(+ 27827. )] MONTH_EFF03
+[X9(T)][(- 37945. )] MONTH_EFF09
+[X10(T)][(- 23652. )] MONTH_EFF10
+[X11(T)][(- 33488. )] MONTH_EFF11
+[X12(T)][(+ 39389. )] FIXED_EFF_N10107
+[X13(T)][(+ 63399. )] FIXED_EFF_N10207
+[X14(T)][(+ .13727E+06)] FIXED_EFF_N10307
+[X15(T)][(+ .25144E+06)] FIXED_EFF_N10407
+[X16(T)][(+ .32004E+06)] FIXED_EFF_N10507
+[X17(T)][(+ .29156E+06)] FIXED_EFF_N10607
+[X18(T)][(+ 74960. )] FIXED_DAY02
+[X19(T)][(+ 39299. )] FIXED_DAY03
+[X20(T)][(+ 27660. )] FIXED_DAY05
+[X21(T)][(- 33451. )] FIXED_DAY21
+[X22(T)][(+ 43602. )] FIXED_DAY29
+[X23(T)][(+ 68016. )] FIXED_DAY30
+[X24(T)][(+ 226.98 )] :TIME TREND 1 1/ 1 1/ 3/2011 I~T00001__010311stack
+[X25(T)][(- 133.25 )] :TIME TREND 423 61/ 3 2/29/2012 I~T00423__010311stack
+[X26(T)][(+ 164.56 )] :TIME TREND 631 91/ 1 9/24/2012 I~T00631__010311stack
+[X27(T)][(- .42528E+06)] :SEASONAL PULSE 733 105/ 5 1/ 4/2013 I~S00733__010311stack
+[X28(T)][(- .33108E+06)] :SEASONAL PULSE 370 53/ 6 1/ 7/2012 I~S00370__010311stack
+[X29(T)][(- .82083E+06)] :PULSE 326 47/ 4 11/24/2011 I~P00326__010311stack
+[X30(T)][(+ .17502E+06)] :PULSE 394 57/ 2 1/31/2012 I~P00394__010311stack
+ [A(T)]
Let's take your questions one at a time:
- Is there any method to determine optimal smoothing parameters without testing all of them?
You can cast your problem in a state space framework and then numerically optimize your parameters using standard numerical libraries. Forecasting with Exponential Smoothing - The State Space Approach by Hyndman et al. would be a good place to start.
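As a minimal illustration of the idea (a Python sketch on a made-up series, not the full state space machinery of the book): the smoothing parameter can be chosen by handing the one-step-ahead squared-error sum to a standard numerical optimizer, instead of testing a grid of values.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def sse_one_step(alpha, y):
    """One-step-ahead squared-error sum for simple exponential smoothing."""
    level = y[0]
    sse = 0.0
    for obs in y[1:]:
        sse += (obs - level) ** 2                  # error of this step's forecast
        level = alpha * obs + (1 - alpha) * level  # update the smoothed level
    return sse

# Hypothetical series with a level shift halfway through.
rng = np.random.default_rng(1)
y = np.concatenate([rng.normal(10, 1, 100), rng.normal(20, 1, 100)])

# Let a standard optimizer pick alpha rather than searching a grid.
res = minimize_scalar(sse_one_step, bounds=(0.01, 0.99), args=(y,), method="bounded")
best_alpha = res.x
```

The same pattern extends to trend and seasonal smoothing parameters by optimizing over a vector of parameters (e.g., with `scipy.optimize.minimize`), which is essentially what state space implementations do internally.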
- Using R.NET to use the forecast package of R will be faster?
Hard to say. I have no experience with R.NET. Using, e.g., ets() (which does use a state space approach) in the forecast package directly will certainly be faster, especially if, as you do, you specify the model rather than letting ets() find the (hopefully) best one.
If so, should I:
- Use daily or monthly data?
This should really depend on what you want to do with the forecast. What forecast granularity do you really need to make decisions? Sometimes, it is better to forecast higher-frequency data and then aggregate, but usually, I'd rather aggregate the history and forecast on the granularity I plan on using later on.
Plus, daily data will likely be intermittent, in which case you can't use Holt(-Winters) or ARIMA, but should go with Croston's method. This may be helpful. Intermittent demands are usually harder to forecast.
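A bare-bones sketch of Croston's method (my own simplified Python implementation for illustration, not a reference to any particular package): smooth the nonzero demand sizes and the intervals between demands separately, and forecast their ratio.

```python
def croston(demand, alpha=0.1):
    """Croston's method: the per-period demand-rate forecast is the
    smoothed nonzero demand size divided by the smoothed interval."""
    size = None      # smoothed nonzero demand size
    interval = None  # smoothed interval between nonzero demands
    periods_since = 1
    for d in demand:
        if d > 0:
            if size is None:                      # initialize on first demand
                size, interval = float(d), float(periods_since)
            else:
                size = alpha * d + (1 - alpha) * size
                interval = alpha * periods_since + (1 - alpha) * interval
            periods_since = 1
        else:
            periods_since += 1
    if size is None:
        return 0.0                                # no demand observed at all
    return size / interval

# Hypothetical intermittent daily demand.
demand = [0, 0, 3, 0, 0, 0, 2, 0, 4, 0, 0, 2, 0]
rate = croston(demand)
```

Note that Croston's output is a demand *rate* per period, not a per-period point forecast in the usual sense, which is one reason intermittent demand is harder to work with.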
EDIT: you write that you need to determine safety amounts. Well, now you will actually need to think about your supply chain. Maybe forecasting is not your problem at all - if sales are all 0 or 1 and you can replenish stocks within a day, your best strategy would be to always have 1 unit on hand and replenish that after every sale, forgetting entirely about forecasting.
If that is not the case (you write that you have seasonality on an aggregate level), you may need to do something ad-hoc, since I don't think there is anything on seasonal intermittent demand out there. You could aggregate data to get seasonal forecasts, then push those down to the SKU level to get forecasts on that level (e.g., by distributing the aggregate forecasts according to historical proportions), finally get safety amounts by taking quantiles of, e.g., the Poisson distribution. As I said, this is pretty ad-hoc, with little statistical grounding, but it should get you 90% there - and given that forecasting is an inexact science, the last 10% may not be feasible, anyway.
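A sketch of that ad-hoc pipeline with made-up numbers (the SKU names, shares, and service level here are all hypothetical):

```python
from scipy.stats import poisson

# Seasonal forecast on the aggregate level, e.g., from Holt-Winters.
aggregate_forecast = 100.0                       # units next period, all SKUs

# Push the aggregate down to SKUs by historical proportions.
historical_share = {"sku_a": 0.05, "sku_b": 0.02}
sku_forecasts = {k: aggregate_forecast * v for k, v in historical_share.items()}

# Safety amounts: the 95% quantile of a Poisson with the SKU's forecast mean.
safety = {k: int(poisson.ppf(0.95, mu)) for k, mu in sku_forecasts.items()}
```

The Poisson assumption is the weakest link here; if demand is more variable than Poisson (overdispersed), a negative binomial quantile would be the natural ad-hoc upgrade.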
- Make also an auto.arima? How to determine which model is better?
Yes, try that one, too. Use a holdout sample to determine which model is better, as you describe in your comment, which is very good practice.
Look also at averages of forecasts from different methods - often, such averages yield better forecasts than the component forecasts. EDIT: That is, fit both a Holt-Winters and an auto.arima model, calculate forecasts from both models, and then, for each time bucket in the future, take the average of the two forecasts from the two models. You can do this with even more models, too - averaging seems to work best if the component models are "very different". Essentially, you are reducing the variance of your forecasts.
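The averaging itself is trivial; a Python sketch with hypothetical forecast numbers standing in for the Holt-Winters and ARIMA outputs:

```python
import numpy as np

# Hypothetical forecasts for the next 4 time buckets from two models.
holt_winters_fc = np.array([102.0, 98.0, 110.0, 95.0])
arima_fc = np.array([96.0, 104.0, 100.0, 101.0])

# Per-bucket average of the component forecasts.
combined_fc = (holt_winters_fc + arima_fc) / 2
```

With more than two models, an unweighted average is still a strong default; estimating "optimal" combination weights from a short history often does worse.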
- Is my method of backtesting (make a model only with data previous to that point) valid to determine if a model is better than another?
As I wrote above: yes, it is. This is out-of-sample testing, which is really the best way to assess forecast accuracy and method quality.
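The procedure can be sketched as follows - a Python toy in which two trivial "models" (last value, training mean) stand in for real candidates; each forecast is made using only data before its origin:

```python
import numpy as np

def rolling_origin_mae(y, fit_forecast, n_test):
    """Refit on data up to each origin, score the one-step-ahead forecast."""
    errors = []
    for origin in range(len(y) - n_test, len(y)):
        train = y[:origin]
        fc = fit_forecast(train)          # model sees only the past
        errors.append(abs(y[origin] - fc))
    return float(np.mean(errors))

# Hypothetical series; the two lambdas play the role of competing models.
rng = np.random.default_rng(2)
y = rng.normal(50, 5, 120)
naive_mae = rolling_origin_mae(y, lambda tr: tr[-1], n_test=12)
mean_mae = rolling_origin_mae(y, lambda tr: tr.mean(), n_test=12)
```

Whichever model achieves the lower out-of-sample error over the holdout wins; with real models, `fit_forecast` would refit, e.g., Holt-Winters or auto.arima on each training slice.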
- EDIT: How can I do that (get predictions over the last 12 data without considering them in the model) in R?
Unfortunately, there is no way to take an ets()-fitted object and update it with a new data point (as in update() for lm()-fitted models). You will need to call ets() twelve times.
You could, of course, fit the first model and then re-use the model ets() chose in this first fit for subsequent refits. This model is reported in the components part of the ets() result. For instance, taking the first five years of the USAccDeaths dataset:
fit <- ets(ts(USAccDeaths[1:60],start=c(1973,1),frequency=12))
Refit using the same model:
refit <- ets(ts(USAccDeaths[1:61],start=c(1973,1),frequency=12),
model=paste(fit$components[1:3],collapse=""))
This will make refitting quite a lot faster, but of course the refit may not find the MSE-optimal model any more. Then again, the MSE-optimal model should not change too much if you add just a few more observations.
As always, I highly recommend this free online forecasting textbook.
Best Answer
Do you intend to model $x_t$ as an ARIMAX process where the exogenous regressors are distributed lags of $x_t$? That sounds peculiar. Why not stick to either a pure ARIMA model or a pure distributed lag model for $x_t$?
Yes, iterative prediction which you suggest in the last paragraph seems a reasonable solution given your setting. It is quite commonly used in ARIMA and VAR type of models, for example.
Note that predicting 365 days ahead using iterative forecasting with ARIMA type of models may be quite disastrous; forecast errors will compound and get way out of hand. While the first few forecasts may be fine, do not expect high forecast accuracy beyond that; actually, some naive forecast (sample mean, last observed value or the like) will likely do better than iterated ARIMA for distant forecast horizons.
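A toy illustration of why distant iterated forecasts degenerate (Python, with hypothetical AR(1) coefficients): feeding each forecast back into the model makes the point forecast decay geometrically toward the unconditional mean, so far out it is exactly the kind of naive mean forecast mentioned above - while the forecast-error variance keeps growing.

```python
# Stationary AR(1): y_t = c + phi * y_{t-1} + e_t, with made-up coefficients.
phi, c = 0.8, 2.0
mean = c / (1 - phi)              # unconditional mean = 10

last_obs = 20.0                   # hypothetical last observed value
forecasts = []
f = last_obs
for h in range(365):
    f = c + phi * f               # iterate: feed each forecast back in
    forecasts.append(f)
```

Here the one-step forecast is 18, but by horizon 365 the forecast is indistinguishable from the mean of 10; all the information in the last observation has been "forgotten" long before then.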
Note also that the functions arima and auto.arima with exogenous regressors implement regression with ARMA errors rather than the ARIMAX model; see more here. (I had some trouble reading your text, so let me know if I misunderstood something.)