I'm working on forecasting weekly sales by category, and I want to make sure I'm doing it correctly.
date DiningSales
3/1/2015 243334
3/8/2015 556637
3/15/2015 554315
......
10/1/2017 343660
I've read Rob Hyndman's website and see that he recommends TBATS for weekly data. My data run from 3/1/2015 to 10/1/2017, and I want to predict the next 4 weeks. Here is my code:
library(forecast)  # provides tbats() and forecast()
dining.data <- read.csv("sales_dining.csv", fileEncoding="UTF-8-BOM")
dining.df <- dining.data$DiningSales
dining.ts <- ts(dining.df,
freq=365.25/7, start=2015+59/365.25)
dining.tbats <- tbats(dining.ts)
dining.fc <- forecast(dining.tbats, h=4)
plot(dining.fc, ylab="dining orders")
The code runs but my predictions for the next 4 weeks seem way off. My questions are:
- Did I use the correct frequency to get weekly data?
- Is my start date correct? I want the start date to be 3/1/2015.
- In the next four predictions, the dates appear as decimal years: 10/7 is shown as 2017.768. Is there a way to have the dates in the results be in an actual date format?
- Finally, it seems like there isn't much out there on weekly forecasting, compared with monthly, annual, or even daily or hourly forecasting. Is weekly data harder to predict accurately?
Best Answer
freq=365.25/7 means you are assuming a year-long seasonality in your data. Since your results don't look good, try removing the seasonality by setting freq=1.
You can also try using the ets and auto.arima functions with freq=1 to see what kind of results they give. If the seasonality of your data is not strong, using a non-seasonal model will likely give better results.
You don't have a ton of data, but you could perhaps hold out a test set of 1 year to compare the tbats, ets, and auto.arima predictions.
If you have external data that might affect sales (e.g. holidays or promotions), you can include that data in the auto.arima model, but not in tbats or ets.
Also, 243334 on 3/1/2015 vs 556637 on 3/8/2015 is a pretty huge jump for 1 week: are you sure you don't have a data quality issue?
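The hold-out comparison suggested above could look roughly like the sketch below. It is only an illustration under stated assumptions: the series is synthetic (random numbers standing in for DiningSales, not the real data), the names sales, train, test, fits and rmse are made up for the example, and RMSE is just one possible accuracy measure.

```r
library(forecast)

# Synthetic weekly series standing in for DiningSales (NOT the asker's data).
set.seed(1)
n <- 136                                   # roughly 3/1/2015 to 10/1/2017 in weeks
sales <- 400000 + 500 * seq_len(n) + rnorm(n, sd = 20000)

h <- 52                                    # hold out the last year as a test set
train <- ts(sales[1:(n - h)], frequency = 1)   # frequency = 1: no seasonality assumed
test  <- sales[(n - h + 1):n]

# Fit the three candidate models on the training portion only
fits <- list(
  tbats      = tbats(train),
  ets        = ets(train),
  auto.arima = auto.arima(train)
)

# Out-of-sample RMSE of each model over the held-out year
rmse <- sapply(fits, function(fit) {
  fc <- forecast(fit, h = h)
  sqrt(mean((test - as.numeric(fc$mean))^2))
})
print(round(rmse))
```

With the real data, sales would be replaced by dining.data$DiningSales, and the model with the lowest hold-out error would be the natural candidate for the 4-week forecast. With under three years of history, a 52-week hold-out leaves only about 84 weeks for fitting, so the comparison should be read as a rough guide rather than a definitive ranking.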