Solved – Forecasting technique for daily data with monthly and day of week seasonality

arima, forecasting, r, stationarity, time series

I have daily sales data for 3 years. The data are seasonal: the business has spikes and dips by month, and sales also differ by day of the week. For example, Mondays within a given month tend to follow a similar pattern.

I have used ARIMA, creating a matrix of month dummy variables and day-of-week dummy variables and passing it to the ARIMA model as regressors. However, I hit a wall when I couldn't convert the forecasts of the differenced (stationary) series back into the actual sales metric. I have already posted a question about that here.
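For reference, converting forecasts of a differenced series back to the original scale is a cumulative sum that starts from the last observed values. The sketch below is a minimal illustration in plain Python, not the code used in the question. The function names `undifference` and `undifference_seasonal` are made up for this example, and it assumes a single first difference or a single seasonal difference was applied by hand. If the differencing is instead specified through the model's order argument, R's `arima`/`predict` should already return forecasts on the original scale.

```python
def undifference(last_observed, diff_forecasts):
    """Invert a first difference: y[t] = y[t-1] + d[t].

    last_observed  -- last actual value of the original series
    diff_forecasts -- forecasts of the differenced series, in time order
    """
    levels = []
    current = last_observed
    for d in diff_forecasts:
        current = current + d      # accumulate each forecast change
        levels.append(current)
    return levels


def undifference_seasonal(history, diff_forecasts, period=7):
    """Invert a seasonal difference: y[t] = y[t-period] + d[t].

    history must hold at least `period` actual values of the original series.
    """
    extended = list(history)
    for d in diff_forecasts:
        # the value one season back may itself be a forecast
        extended.append(extended[-period] + d)
    return extended[len(history):]


print(undifference(100, [5, -2, 3]))
print(undifference_seasonal([10, 20, 30], [1, 1, 1, 1], period=3))
```

If both a first and a seasonal difference were taken, the two inversions are applied in the reverse order of the differencing.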

I have also tried a dummy-variable regression with sales as the dependent variable, 11 month dummy variables and 6 day-of-week dummy variables. I abandoned this because R-squared was low at 48% and the MAPE of the forecasts was more than 20%.
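To make the setup concrete, here is a minimal sketch of the dummy coding and the MAPE calculation described above, in plain Python. The helper names `calendar_dummies` and `mape` and the choice of January and Monday as baseline categories are assumptions made for illustration; the original analysis was done in R and may have used different baselines.

```python
from datetime import date, timedelta


def calendar_dummies(day):
    """Return 17 indicators: 11 month dummies, then 6 day-of-week dummies.

    January and Monday are the omitted baseline categories (an assumption).
    """
    month = [1 if day.month == m else 0 for m in range(2, 13)]   # Feb..Dec
    dow = [1 if day.weekday() == d else 0 for d in range(1, 7)]  # Tue..Sun
    return month + dow


def mape(actual, forecast):
    """Mean absolute percentage error, skipping days with zero actual sales."""
    pairs = [(a, f) for a, f in zip(actual, forecast) if a != 0]
    return 100.0 * sum(abs(a - f) / abs(a) for a, f in pairs) / len(pairs)


# Design-matrix rows for one year of daily data (the start date is arbitrary).
start = date(2015, 1, 1)
rows = [calendar_dummies(start + timedelta(days=i)) for i in range(365)]
print(len(rows), len(rows[0]))
```

Dropping one category from each set avoids perfect collinearity with the intercept, which is why there are 11 and 6 dummies rather than 12 and 7.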

Edit: I have tried auto.arima as well.
My question: what technique can I use to forecast sales for the next 365 days while accounting for both month-of-the-year and day-of-the-week seasonality?

Best Answer

You might want to look at http://www.autobox.com/pdfs/capable.pdf starting with slide 43 for an example and any number of my responses to this list as this subject has come up many times.

The issue is that DAILY DATA can be largely dependent on deterministic variables like day-of-the-week, week-of-the-year, month-of-the-year, week-of-the-month, long weekends, Fridays-before-a-Monday-holiday or Mondays-after-a-Friday-holiday and/or particular day-of-the-month effects.
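As a rough illustration of what such deterministic variables look like in practice, the sketch below derives several of them from a date in plain Python. It is not the procedure from the linked slides. The `HOLIDAYS` set, the function name `calendar_features` and the week-of-the-month definition (days 1 to 7 are week 1, and so on) are assumptions chosen for the example; a real holiday calendar for the business has to be supplied.

```python
from datetime import date, timedelta

# Hypothetical holiday list for illustration only; replace with the real calendar.
HOLIDAYS = {date(2016, 3, 25), date(2016, 5, 30)}  # a Friday and a Monday


def calendar_features(day, holidays=HOLIDAYS):
    """Deterministic calendar variables for a single date."""
    return {
        "day_of_week": day.weekday(),             # 0 = Monday ... 6 = Sunday
        "week_of_year": day.isocalendar()[1],     # ISO week number
        "month_of_year": day.month,
        "week_of_month": (day.day - 1) // 7 + 1,  # assumed definition
        "day_of_month": day.day,
        "is_holiday": day in holidays,
        "day_before_holiday": (day + timedelta(days=1)) in holidays,
        "day_after_holiday": (day - timedelta(days=1)) in holidays,
        # Friday whose following Monday is a holiday (long weekend ahead)
        "friday_before_monday_holiday":
            day.weekday() == 4 and (day + timedelta(days=3)) in holidays,
        # Monday whose preceding Friday was a holiday (long weekend just ended)
        "monday_after_friday_holiday":
            day.weekday() == 0 and (day - timedelta(days=3)) in holidays,
    }


print(calendar_features(date(2016, 5, 27)))
```

Each of these columns can then be offered to the model as a candidate regressor. Which ones actually matter for a given series is an empirical question.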

A major hurdle for you is that holidays (before, on and after) are important and heuristics (i.e. not simply done!) are required to identify many of these structures. Furthermore there may be changes in daily patterns over time and different volatilities (uncertainties/variability) for different days of the week.

Determining these factors requires searching for patterns, not just fitting coefficients. Detecting level shifts, local time trends and one-time unusual values is critical, as is correctly forming an appropriate structure (i.e. ARIMA model identification), since one needs to craft together a number of competing model possibilities. Finally, changing error variance and/or changing model parameters over time need to be considered, as they come into play quite frequently.
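To give a feel for what "searching for patterns" can mean, here is a deliberately simple sketch in plain Python. It is a swapped-in heuristic, not the intervention-detection procedure referred to in this answer: one-time unusual values are flagged with a median/MAD rule, and a single level shift is located by comparing segment means. The function names, the 3.5 threshold and the `min_segment` parameter are assumptions for illustration. In practice these checks would be run on model residuals rather than on raw sales.

```python
from statistics import mean, median


def flag_pulses(values, threshold=3.5):
    """Indices of one-time unusual values, using a modified z-score (median/MAD)."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against
    return [i for i, v in enumerate(values)
            if abs(0.6745 * (v - med) / mad) > threshold]


def best_level_shift(values, min_segment=3):
    """Break point that maximizes the gap between the before and after means.

    Returns (index of first observation after the shift, absolute gap).
    """
    best_index, best_gap = None, 0.0
    for i in range(min_segment, len(values) - min_segment + 1):
        gap = abs(mean(values[i:]) - mean(values[:i]))
        if gap > best_gap:
            best_index, best_gap = i, gap
    return best_index, best_gap


print(flag_pulses([10, 11, 9, 10, 50, 10, 11, 9, 10]))
print(best_level_shift([10, 10, 10, 10, 20, 20, 20, 20]))
```

A detected pulse or shift would typically enter the model as an extra dummy regressor (0/1 at the affected dates) rather than being deleted from the data.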

In closing, one may also need to bring in user-specified predictor series, along with their lead and lag structure, to explain the series of interest.
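A lead or lag of a predictor is simply the same series shifted in time, so the effect of an event can show up before or after the day it occurs. The sketch below shows one way to build such columns in plain Python. The `shift` helper and the `promo` indicator are hypothetical names invented for this example.

```python
def shift(series, k, fill=None):
    """Shift a series in time.

    k > 0 gives a lag  (value at t is series[t - k]);
    k < 0 gives a lead (value at t is series[t + |k|]).
    Positions with no source value are filled with `fill`.
    """
    n = len(series)
    return [series[t - k] if 0 <= t - k < n else fill for t in range(n)]


# Hypothetical promotion indicator: 1 on the day a promotion runs.
promo = [0, 0, 1, 0, 0, 0]

promo_lag1 = shift(promo, 1, fill=0)    # effect the day after the promotion
promo_lead1 = shift(promo, -1, fill=0)  # effect the day before (anticipation)
print(promo_lag1, promo_lead1)
```

Several such shifted copies can be offered to the model together, and the ones whose coefficients are not supported by the data can be dropped.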