Solved – Simple method of forecasting number of guests given current and historical data

distributionsforecastingprediction intervaltime series

I am trying to predict the number of guests a restaurant might serve in a meal period based on the volume of business that same day from prior years (3-5 years of data), trends for the same day of the week in the recent past (6-8 weeks), and the daily trend vs last year for the 21 days leading up to the day in question. For example, I would like to predict the number of lunches I might serve on Wednesday, Nov 15, 2017 using data from each of the same Wednesdays from the past 3-5 years, as well as the 6-8 Wednesdays immediately prior to 11/15/17, and the daily trends vs last year for each of the 21 days prior to 11/15/17. Ideally, I would like a to return a result showing a low end, high end, and most likely. I am not a statistician, just a restaurant manager trying to take some of the guesswork out of staffing and ordering. I am with a small company that uses Excel for most spreadsheet applications.

Best Answer

There are simple methods to use but they are probably profoundly wrong as daily data presents a ton of opportunities . Simply striking daily averages is both simple and useless BUT I guess if there is no analytics around it is probably better than the overall average . The absolute LAST approach would be to use the overall mean . Models need to be as simple as possible BUT never too simple.

As was nicely summarized by @Frans your problem/opportunity is a complicated one but very rewarding. Besides some of the items mentioned there are individual lead and lag effects around each holiday along with possible level shifts and changes in day-of-the-week effects and of course how to identify and treat anomalies. There are also possible week-of-the=month effects and day-of-the-month effects et al. Identifying the structure is the problem and possibly even incorporating pricing and advertising effects.

Take a look at http://autobox.com/cms/index.php/afs-university/intro-to-forecasting/doc_download/53-capabilities-presentation particularly slide 50- . I have been working with fast-food restaurant chains to even push down the forecast to 15 minute intervals and will try and give you some guidance. If you post your data and specify the country and the start date , I will try and help further. Preferably you might post 3 years of daily data as there are probably seasonal and holiday effects that might need to be identified.

In terms of being able to quickly come up with forecasts this is handled by storing and updating models and then quickly forecasting using models that have been archived. Prediction intervals should be approached via monte-carlo to provide robust estimates for the range of future values.

You have a complicated problem AND there are a lot of bad simple solutions that may be insufficient but inexpensive. if this is important then perhaps you need to muscle-up to a workable & affordable solution.

EDIT UPON RECEIPT OF DATA:

After receipt of your data I arbitrarily took the LUNCH series (you had provided both LUNCH and DINNER data )and inserted 0.0's for some missing days obtaining 1454 daily values .. start date 1/1/2013 ending 12/24/2016 and introduced the data to AUTOBOX requesting a 14 day forecast.

Here is the 14 day (arbitrarily chosen) forecast enter image description here . The acf of the original data showed significant memory structure which is of course presuming no special causes enter image description here while the acf of the final model's residuals showed no remaining stochastic structure in the residuals enter image description here . Since the sample size is large we get "false conclusions" using the very approximate standard deviation of the acf (1/sqrt(# of observations). The plot of the final model's residuals supported the randomness conclusion or at least the suggestion that the model couldn't be rejected enter image description here

How to evaluate deterministic vs stochastic components of a time series? discusses the advantages of integrating both stochastic (arima/memory) structure and event /fixed effects found via search procedures culminating in a holistic model.

Restaurant activity is a classic example of how we do things in predictable rhythms. Arrivals to a restaurant follow day-of-the-week patterns and monthly patterns albeit being very affected by holidays and other special events. To summarize the model contains 6 types of factors/features separating the observed series to signal(predictable) and noise(random) .These 6 features are 1) Baseline ; 2) day-of-the week ; 3) month-of-the-year ; 4) pre, contemporary and post holiday effects ; 5) Deterministic effects discovered via Intervention Detection ; 6) memory (previous values).

The Final model's statistics are here enter image description here with Actual/Fit and Forecast here enter image description here

Detailing the 6 features . First the baseline ..essentially an expectation before identified effects are introduced .

enter image description here

now the day-of-the week

enter image description here

now the month=of-the-year

enter image description here

now the holidays

enter image description here

now the identified exogenous deterministic/unattributed effects (partial list)

enter image description here

and finally the effect of prior observations i.e. memory reflecting unspecified variables omitted from the model . This is the conditional effect of memory GIVEN the deterministic (assignable cause) structure

enter image description here

The window of response around each holiday is presented using the backshift operator B ; https://en.wikipedia.org/wiki/Lag_operator

Related Question