Solved – Variations in time series

I'm trying to do some really simple Time Series work. I started off looking at monthly counts but the more I thought about it (and the more I read posts by IrishStat) the more I realised my view of this was fundamentally flawed. My main reason for thinking this is the plethora of timing issues there are with using monthly aggregate figures:

  • Varying month length
  • Varying numbers of weekdays
  • Weekends
  • Holidays

But what gets me is that this must be a really common issue, I mean practically every big company out there must have an executive report that includes monthly figures. How does one balance for the discrepancies between the months you are comparing?

I'm using R to slowly crawl through this stuff and currently I'm just looking at cross correlations with lag. I've looked at decomposition and how to adjust for seasonality but how does one adjust for dates that you know will have an effect on business activity? Is there a way to integrate a list of dates that should be treated differently or somehow compensate for the number of weekend/holiday days in a month (or in a week, or in a year)?

A general explanation of the concept would be much appreciated.

Edited to make the question more succinct:

I am looking at a time series of monthly counts. 2 examples from this time series would be:

  • February 2012 is 29 days long, has 8 weekend days and zero public holidays
  • May 2013 is 31 days long, also has 8 weekend days but also has 2 public holidays

How would I compare them without having my conclusions adversely affected by their differences?

  • I could divide by days in month but this doesn't take into account holidays
  • I could divide by working days in month, but this doesn't differentiate between a holiday and a weekend.
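
To make the comparison concrete, here is a minimal Python sketch that counts calendar days, weekend days and weekday holidays for the two example months. The function name `month_day_counts` and the two May 2013 holiday dates (the UK bank holidays on 6 and 27 May) are assumptions for illustration only; the question does not say which holiday calendar applies.

```python
import calendar
from datetime import date


def month_day_counts(year, month, holidays=()):
    """Count calendar, weekend, holiday and working days in one month.

    `holidays` is an iterable of datetime.date objects (hypothetical input).
    Holidays that fall on a weekend are not counted a second time.
    """
    holiday_set = set(holidays)
    n_days = calendar.monthrange(year, month)[1]
    days = [date(year, month, d) for d in range(1, n_days + 1)]
    weekend_days = sum(1 for d in days if d.weekday() >= 5)  # Sat=5, Sun=6
    weekday_holidays = sum(
        1 for d in days if d.weekday() < 5 and d in holiday_set
    )
    return {
        "days": n_days,
        "weekend_days": weekend_days,
        "holidays": weekday_holidays,
        "working_days": n_days - weekend_days - weekday_holidays,
    }


# Assumed holiday list: UK bank holidays in May 2013 (not given in the question).
may_2013_holidays = [date(2013, 5, 6), date(2013, 5, 27)]

feb_2012 = month_day_counts(2012, 2)
may_2013 = month_day_counts(2013, 5, may_2013_holidays)
print(feb_2012)
print(may_2013)
```

Under these assumptions both months end up with 21 working days, so a per-working-day rate treats them as equivalent even though the 10 non-working days in May are a mix of weekends and holidays. That is the weakness the second bullet describes.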

Best Answer

You asked for a general explanation of the concept. Your comments about the current status of forecasting at three levels of aggregation are dead on! My answer may not deal precisely with some of your specific interests, as you have focused on some distractions, but I thought that I would share the following with you. I was asked to discuss how software I had helped write could deal with and accommodate monthly vs weekly vs daily forecasts.

My response was in three parts: A. Overall comments on weekly versus monthly. B. The argument for parsing the monthly forecast to daily using simple ratios. C. The argument against (B) and FOR daily forecasts to be developed DIRECTLY and then used to make weekly and/or monthly forecasts.

Response A)

Monthly:

Advantages – Fast to compute, easier to model, easier to identify changes in trends, better for strategic long term forecasting

Disadvantages – If you need to plan at the daily level for capacity, people and spoilage of product, then higher levels of forecasting won’t help you understand the demand on a daily basis, as a 1/30th ratio estimate is clearly insufficient.

Causal variables that change on a frequent basis (ie daily/weekly – price, promotion) are not easily integrated into monthly analysis

Integrating Macroeconomic variables like Quarterly Unemployment requires an additional step of creating splines.

Weekly:

Advantages – When you can’t handle the modeling process at a daily level you “settle” for this. It also suits very systematic cycles, like “Arctic ice extents”, that follow a rigid curve and have no need for day-of-the-week variations.

Disadvantages – Floating holidays like Thanksgiving, Easter, Ramadan and Chinese New Year fall on different dates every year and disrupt the estimates of the week-of-the-year coefficients. This CAN be handled by creating a variable for each holiday.
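
As a rough illustration of "creating a variable for each" floating holiday, the Python sketch below builds a 0/1 indicator column per holiday for a weekly series. The helper names `week_start` and `holiday_dummies` are hypothetical, and the weekly series is assumed to be indexed by the Monday of each week; the original answer does not specify either.

```python
from datetime import date, timedelta


def week_start(d):
    """Return the Monday of the week containing date d."""
    return d - timedelta(days=d.weekday())


def holiday_dummies(week_starts, holiday_dates):
    """Build one 0/1 column per holiday for a weekly series.

    week_starts: list of Mondays, one per weekly observation (assumed layout).
    holiday_dates: dict mapping a holiday name to the dates it fell on.
    """
    columns = {}
    for name, dates in holiday_dates.items():
        flagged = {week_start(d) for d in dates}
        columns[name] = [1 if w in flagged else 0 for w in week_starts]
    return columns


# Easter Sunday fell on 8 April 2012 and on 31 March 2013.
easter = [date(2012, 4, 8), date(2013, 3, 31)]
easter_iso_weeks = {d.year: d.isocalendar()[1] for d in easter}
print(easter_iso_weeks)  # the week number differs between the two years

# Eight weekly observations covering March and April 2013 (Mondays).
weeks_2013 = [date(2013, 3, 4) + timedelta(weeks=i) for i in range(8)]
dummies = holiday_dummies(weeks_2013, {"easter": easter})
print(dummies["easter"])
```

Because the indicator follows the holiday rather than the week number, a regression can estimate the holiday effect separately instead of letting it contaminate a fixed week-of-the-year coefficient.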

The number of weeks in a year is subject to change, which creates a statistical issue because not every year has exactly 52 weeks. We have seen the need to allocate the 53rd week to a “non-player” week to make the data a standard 52-week period, which is workable but disruptive compared to daily data.
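
A small Python sketch of the 53-week problem, assuming weeks are numbered by the ISO 8601 convention (the answer does not say which week-numbering scheme the software used). The function name `iso_weeks_in_year` is made up for this example.

```python
from datetime import date


def iso_weeks_in_year(year):
    """Number of ISO 8601 weeks in a year (52 or 53).

    28 December always falls in the last ISO week of its year,
    so its week number is the week count for that year.
    """
    return date(year, 12, 28).isocalendar()[1]


# Years between 2000 and 2020 that carry an extra 53rd week.
long_years = [y for y in range(2000, 2021) if iso_weeks_in_year(y) == 53]
print(long_years)
```

Any year in that list needs special handling (such as the “non-player” week mentioned above) before it can be lined up against 52-week years.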

Causal variables that change on a daily basis (e.g. price, promotion) are not easily integrated into weekly analysis

Integrating Macroeconomic variables like Quarterly Unemployment requires an additional step of creating splines.

Response B) (tongue-in-cheek answer)

Assume you had the daily data in a data warehouse and wanted to develop daily forecasts from the monthly forecasts.

I would take each monthly forecast and partition it into daily forecasts in the following manner.

  1. Compute the daily averages from the history database, so that the D1, D2, …, D7 averages are known. I would then compute the overall average (XBAR) and 7 indices I1=D1/XBAR ; I2=D2/XBAR …. I7=D7/XBAR. The 7 I’s thus represent percentages of the overall average, i.e. .9, 1.2, …, .8 for example.

  2. I would then compute a forecast for DAY1 in the month by using the appropriate I value to get [1/30]*Monthly forecast*I, essentially adjusting the baseline daily forecast of 1/30th of the monthly expectation.

  3. Finally I would normalize these DAILY forecasts so that they add up to the monthly forecast.
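
The ratio procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the dictionary-based data layout and the toy history (10 units on weekdays, 5 on weekends) are all invented here, and the baseline uses 1/(days in month) rather than a fixed 1/30, which makes no difference once the final normalization is applied.

```python
import calendar
from collections import defaultdict
from datetime import date, timedelta


def day_of_week_indices(history):
    """Step 1: day-of-week averages divided by the overall average (XBAR).

    history: dict mapping datetime.date -> observed daily value.
    Returns a dict mapping weekday number (Mon=0 .. Sun=6) -> index.
    """
    sums, counts = defaultdict(float), defaultdict(int)
    for d, value in history.items():
        sums[d.weekday()] += value
        counts[d.weekday()] += 1
    xbar = sum(history.values()) / len(history)
    return {wd: (sums[wd] / counts[wd]) / xbar for wd in sums}


def split_monthly_forecast(monthly_forecast, year, month, indices):
    """Steps 2 and 3: scale an even daily share by the index, then normalize."""
    n_days = calendar.monthrange(year, month)[1]
    days = [date(year, month, d) for d in range(1, n_days + 1)]
    raw = [monthly_forecast / n_days * indices[d.weekday()] for d in days]
    scale = monthly_forecast / sum(raw)  # forces the days to add to the month
    return {d: r * scale for d, r in zip(days, raw)}


# Toy history (assumed): four full weeks starting Monday 1 April 2013.
history = {}
for i in range(28):
    d = date(2013, 4, 1) + timedelta(days=i)
    history[d] = 5.0 if d.weekday() >= 5 else 10.0

indices = day_of_week_indices(history)
daily = split_monthly_forecast(300.0, 2013, 5, indices)
print(round(sum(daily.values()), 6))
```

Note that nothing in this sketch knows about holidays, trends or changing day-of-week patterns; it simply spreads the monthly number according to fixed historical ratios.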

Response C)

I should also add that the procedure I laid out in (B) rests on a number of assumptions about the historical data, most of which are unrealistic in my opinion:

  1. That there are no trends and no level shifts.
  2. That there are no PULSES (one-time unusual values).
  3. That there are no holiday effects OR special-days-in-the-month effects OR special-weeks-in-the-month effects OR beginning/end-of-the-month effects.
  4. That there are no seasonality effects (monthly or weekly).
  5. That there have been no changes in the day-of-the-week averages over time.
  6. That there is no autoregressive structure.
  7. That there have been no changes in model parameters or the error variance over time.

All of these considerations suggest that models should be developed at the daily level in order to provide information as quickly as possible.
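
Once daily forecasts exist, rolling them up to weekly or monthly totals is simple addition. Here is a minimal Python sketch; the function name `aggregate_daily`, the use of ISO weeks and the made-up daily forecast values are assumptions for illustration, not part of the software discussed above.

```python
from collections import defaultdict
from datetime import date, timedelta


def aggregate_daily(daily_forecast):
    """Sum daily forecasts into monthly and ISO-week totals.

    daily_forecast: dict mapping datetime.date -> forecast value.
    Returns (monthly, weekly) dicts keyed by (year, month) and
    (ISO year, ISO week) respectively.
    """
    monthly = defaultdict(float)
    weekly = defaultdict(float)
    for d, value in daily_forecast.items():
        monthly[(d.year, d.month)] += value
        iso = d.isocalendar()
        weekly[(iso[0], iso[1])] += value
    return dict(monthly), dict(weekly)


# Hypothetical daily forecasts for May 2013: 10 on weekdays, 4 on weekends.
daily_forecast = {}
for i in range(31):
    d = date(2013, 5, 1) + timedelta(days=i)
    daily_forecast[d] = 4.0 if d.weekday() >= 5 else 10.0

monthly, weekly = aggregate_daily(daily_forecast)
print(monthly[(2013, 5)])
print(weekly[(2013, 19)])  # the full week of 6 to 12 May 2013
```

Going in this direction keeps every holiday and day-of-week effect that the daily model captured, whereas splitting a monthly number downward has to guess them.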

Hope this helps!