Solved – Approaches to Forecasting with Daily Timeseries

forecasting, r, time series

I have just started to learn about forecasting. I thought it would be easy to create forecast models for a daily time series, but I have run into a number of difficulties. First, most examples and available datasets are monthly or quarterly; examples with weekly or daily data are rare. Second, it appears difficult to create a time series object for days (365) or weeks (52), because the number of periods varies between years. This may just be how the time series object works in R; I have had to use zoo instead. I am also concerned that my data may not be structured properly for use in packages like forecast and hts.

I am interested in how best to approach this problem. Any examples of forecasting daily events that may cycle across years would be greatly appreciated.

Best Answer

Daily data is often affected by:

1) day-of-the-week profiles and changes in these profiles;
2) week of the year;
3) time trends (note the plural);
4) level shifts;
5) monthly effects;
6) particular days of the month, e.g. the 1st, the 15th, etc.;
7) lead, contemporaneous and lag effects around known events, e.g. Christmas, July 4th, etc.;
8) unusual values;
9) long weekends around events;
10) particular week-in-month effects;
11) ARIMA structure reflecting unspecified stochastic input series.

That should be enough to start with.

When you think that your equation deals effectively with the above, start to consider the impact of parameters that change over time and an error variance that may change over time.
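Several of the effects in the list above can be encoded as ordinary regressors. The following is a minimal sketch in base R, not the answerer's own procedure: the simulated series `y`, the date range, the choice of Christmas as the event, and all variable names are assumptions made for illustration.

```r
# Simulated daily series (assumption: illustrative data only).
set.seed(1)
dates <- seq(as.Date("2011-01-01"), as.Date("2013-12-31"), by = "day")
n  <- length(dates)
lt <- as.POSIXlt(dates)  # $wday: 0 = Sunday; $mon: 0 = January

# Calendar regressors for some of the listed effects.
dow            <- factor(lt$wday)           # day-of-the-week profile
mon            <- factor(lt$mon + 1)        # monthly effects
first_of_month <- as.integer(lt$mday == 1)  # a particular day of the month
trend          <- seq_len(n)                # a single linear time trend

# Christmas with one lead day and one lag day.
xmas      <- as.integer(format(dates, "%m-%d") == "12-25")
xmas_lead <- c(xmas[-1], 0)   # equals 1 on Dec 24
xmas_lag  <- c(0, xmas[-n])   # equals 1 on Dec 26

# Toy response with a known weekend lift and a known Christmas dip.
y <- 100 + 0.01 * trend + 10 * (lt$wday %in% c(0, 6)) - 30 * xmas + rnorm(n)

# Regression with AR(1) errors; drop the intercept column because
# arima() adds its own.
X <- model.matrix(~ dow + mon + first_of_month + trend +
                    xmas + xmas_lead + xmas_lag)[, -1]
fit <- arima(y, order = c(1, 0, 0), xreg = X, method = "ML")
round(coef(fit)[c("dow1", "xmas", "trend")], 2)
```

With Sunday as the baseline level of `dow`, the weekday coefficients should come out negative here, because the toy data lifts both weekend days. Level shifts, multiple trends and outliers would each need further regressors of the same kind.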

Related Solutions

Solved – Time series forecast in R with yearly frequency

Try using the tbats() function in the forecast package for R:

library(forecast)
x <- ts(myts, frequency = 365)  # myts: vector of daily observations
fit <- tbats(x)
plot(forecast(fit))

TBATS is a generalization of ETS models designed to deal with high frequency data. See http://robjhyndman.com/papers/complex-seasonality/ for the JASA paper behind it.
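Daily data usually has a weekly cycle as well as a yearly one. A possible variation, sketched here on simulated data, is to declare both periods with `msts()` from the same package before calling `tbats()`. The toy series, the two periods chosen, and the 14-day horizon are assumptions for illustration, and fitting may take a while on longer series.

```r
library(forecast)  # provides msts(), tbats() and forecast()

# Simulated daily series with a weekly and a yearly cycle
# (assumption: illustrative data standing in for `myts`).
set.seed(42)
n <- 730
t <- seq_len(n)
myts <- 100 + 10 * sin(2 * pi * t / 7) + 20 * sin(2 * pi * t / 365.25) + rnorm(n)

# Declare both seasonal periods; 365.25 allows for leap years on average.
x   <- msts(myts, seasonal.periods = c(7, 365.25))
fit <- tbats(x)
fc  <- forecast(fit, h = 14)  # two weeks ahead
fc$mean
```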

Solved – Time Series Forecasting with Daily Data: ARIMA with regressor

To gauge an approach, you should evaluate models and forecasts from different origins and across different horizons, not rely on a single number.
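One way to do this is a rolling-origin evaluation. The sketch below uses base R only; the toy series, the seasonal-naive baseline `snaive7()`, and the spacing of the origins are all assumptions for illustration, and any forecasting function with the same interface could be swapped in.

```r
# Rolling-origin evaluation: forecast from several origins and
# record the absolute error at each horizon 1..h.
set.seed(7)
n <- 400
y <- 50 + 5 * sin(2 * pi * seq_len(n) / 7) + rnorm(n)  # assumed toy daily series

h <- 7                               # longest horizon, in days
origins <- seq(300, n - h, by = 10)  # index of the last observation used

# Hypothetical baseline forecaster: repeat the last observed week.
snaive7 <- function(train, h) {
  rep(tail(train, 7), length.out = h)
}

# One row per origin, one column per horizon.
abs_err <- t(sapply(origins, function(o) {
  fc <- snaive7(y[1:o], h)
  abs(y[(o + 1):(o + h)] - fc)
}))
colnames(abs_err) <- paste0("h", 1:h)

# Mean absolute error by horizon, averaged over all origins.
mae_by_horizon <- colMeans(abs_err)
round(mae_by_horizon, 2)
```

Comparing two candidate models on `mae_by_horizon` shows whether one is better only at short horizons or across the board, which a single error number cannot show.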

I assume that your data is from the US. I prefer 3+ years of daily data, as you can have two holidays landing on a weekend and get no weekday read. It looks like your Thanksgiving impact is a day off in 2012, or there was a recording error of some kind that caused the model to miss the Thanksgiving day effect.

Januarys are typically low in the dataset if you look at them as a % of the year. Weekends are high. The dummies reflect this behavior: MONTH_EFF01, FIXED_EFF_N10507, FIXED_EFF_N10607.

I have found that using an AR component with daily data assumes that the day-of-the-week pattern of the last two weeks is how the pattern is in general, which is a big assumption. We started with 11 monthly dummies and 6 daily dummies. Some dropped out of the model. B**1 means that there is a lag impact the day after a holiday. There were 6 special days of the month (days 2, 3, 5, 21, 29, 30; 21 might be spurious?), 3 time trends, 2 seasonal pulses (where a day of the week started deviating from the typical: a 0 before this date and a 1 every 7th day after) and 2 outliers (note the Thanksgiving!). This took just under 7 minutes to run. Download all results here: www.autobox.com/se/dd/daily.zip
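A seasonal pulse of the kind described here is easy to build by hand. The sketch below is one possible construction in base R, not Autobox's internal code; the helper `seasonal_pulse()` is hypothetical, and the start dates are read off the model listing (first observation 1/3/2011, a seasonal pulse starting 1/7/2012).

```r
# Daily dates starting at the first observation in the listing.
dates <- seq(as.Date("2011-01-03"), as.Date("2013-06-30"), by = "day")

# Hypothetical helper: 0 before `start`, then 1 on `start` and on
# every 7th day after it, 0 on all other days.
seasonal_pulse <- function(dates, start) {
  offset <- as.integer(dates - as.Date(start))
  as.integer(offset >= 0 & offset %% 7 == 0)
}

sp <- seasonal_pulse(dates, "2012-01-07")

which(sp == 1)[1]                # position of the first pulse in the series
unique(weekdays(dates[sp == 1])) # the single weekday the pulse falls on
```

The resulting 0/1 column can be used as a regressor next to the ordinary day-of-the-week dummies, so that its coefficient measures how much that weekday changed from the start date onward.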

It includes a quick and dirty XLS sheet to check to see if the model makes sense. Of course, the XLS % are in fact bad as they are crude benchmarks.

Try estimating this model:

Y(T) =  .53169E+06                                                                                        
       +[X1(T)][(+  .13482E+06B** 1)]                                       M_HALLOWEEN
       +[X2(T)][(+  .17378E+06B**-3)]                                       M_JULY4TH
       +[X3(T)][(-  .11556E+06)]                                            M_MEMORIALDAY
       +[X4(T)][(-  .16706E+06B**-4+  .13960E+06B**-3-  .15636E+06B**-2                                                 
       -  .19886E+06B**-1)]                                                 M_NEWYEARS
       +[X5(T)][(+  .17023E+06B**-2-  .26854E+06B**-1-  .14257E+06B** 1)]   M_THANKSGIVI
       +[X6(T)][(-  71726.    )]                                            MONTH_EFF01
       +[X7(T)][(+  55617.    )]                                            MONTH_EFF02
       +[X8(T)][(+  27827.    )]                                            MONTH_EFF03
       +[X9(T)][(-  37945.    )]                                            MONTH_EFF09
       +[X10(T)][(-  23652.    )]                                           MONTH_EFF10
       +[X11(T)][(-  33488.    )]                                           MONTH_EFF11
       +[X12(T)][(+  39389.    )]                                           FIXED_EFF_N10107
       +[X13(T)][(+  63399.    )]                                           FIXED_EFF_N10207
       +[X14(T)][(+  .13727E+06)]                                           FIXED_EFF_N10307
       +[X15(T)][(+  .25144E+06)]                                           FIXED_EFF_N10407
       +[X16(T)][(+  .32004E+06)]                                           FIXED_EFF_N10507
       +[X17(T)][(+  .29156E+06)]                                           FIXED_EFF_N10607
       +[X18(T)][(+  74960.    )]                                           FIXED_DAY02
       +[X19(T)][(+  39299.    )]                                           FIXED_DAY03
       +[X20(T)][(+  27660.    )]                                           FIXED_DAY05
       +[X21(T)][(-  33451.    )]                                           FIXED_DAY21
       +[X22(T)][(+  43602.    )]                                           FIXED_DAY29
       +[X23(T)][(+  68016.    )]                                           FIXED_DAY30
       +[X24(T)][(+  226.98    )]                                           :TIME TREND        1                   1/  1   1/ 3/2011   I~T00001__010311stack
       +[X25(T)][(-  133.25    )]                                           :TIME TREND      423                  61/  3   2/29/2012   I~T00423__010311stack
       +[X26(T)][(+  164.56    )]                                           :TIME TREND      631                  91/  1   9/24/2012   I~T00631__010311stack
       +[X27(T)][(-  .42528E+06)]                                           :SEASONAL PULSE  733                 105/  5   1/ 4/2013   I~S00733__010311stack
       +[X28(T)][(-  .33108E+06)]                                           :SEASONAL PULSE  370                  53/  6   1/ 7/2012   I~S00370__010311stack
       +[X29(T)][(-  .82083E+06)]                                           :PULSE           326                  47/  4  11/24/2011   I~P00326__010311stack
       +[X30(T)][(+  .17502E+06)]                                           :PULSE           394                  57/  2   1/31/2012   I~P00394__010311stack
       +[A(T)]
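To read the B** terms in the listing: B is the backshift operator, so B**1 applied to a holiday dummy puts the effect on the day after the holiday, and a negative power such as B**-2 puts it two days before. The sketch below works this out for the M_THANKSGIVI line around Thanksgiving 2012. The helper `backshift()` and the short date window are assumptions for illustration; the coefficients are copied from the listing.

```r
dates   <- seq(as.Date("2012-11-15"), as.Date("2012-11-30"), by = "day")
holiday <- as.integer(dates == as.Date("2012-11-22"))  # Thanksgiving 2012

# Hypothetical helper: apply B**k, i.e. return x[t - k], padding with zeros.
# k > 0 is a lag (after the event), k < 0 is a lead (before the event).
backshift <- function(x, k) {
  n <- length(x)
  if (k >= 0) c(rep(0, k), x[seq_len(n - k)]) else c(x[(1 - k):n], rep(0, -k))
}

# Coefficients from the M_THANKSGIVI line of the listing.
effect <- 0.17023e6 * backshift(holiday, -2) -
          0.26854e6 * backshift(holiday, -1) -
          0.14257e6 * backshift(holiday,  1)

# Days on which the Thanksgiving terms contribute to Y(T).
data.frame(date = dates, effect = effect)[effect != 0, ]
```

Note that this line of the listing has no contemporaneous (B**0) term, so under this reading the Thanksgiving regressor itself contributes nothing on the holiday in this model.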
