I am working on a project to forecast store sales in order to learn forecasting. Until now I have successfully used the simple auto.arima function for forecasting. To make the forecasts more accurate, I want to use covariates. I have defined covariates such as holidays and promotions, which affect store sales, and passed them through the xreg argument with the help of this post:
How to setup xreg argument in auto.arima() in R?
But my code fails at the line:
ARIMAfit <- auto.arima(saledata, xreg=covariates)
and gives the following error:
Error in model.frame.default(formula = x ~ xreg, drop.unused.levels = TRUE) :
variable lengths differ (found for 'xreg')
In addition: Warning message:
In !is.na(x) & !is.na(rowSums(xreg)) :
longer object length is not a multiple of shorter object length
Below is a link to my dataset:
https://drive.google.com/file/d/0B-KJYBgmb044blZGSWhHNEoxaHM/view?usp=sharing
This is my code:
data = read.csv("xdata.csv")[1:96,]
View(data)
saledata <- ts(data[1:96,4],start=1, end=96,frequency =7 )
View(saledata)
saledata[saledata == 0] <- 1
View(saledata)
covariates = cbind(DayOfWeek=model.matrix(~as.factor(data$DayOfWeek)),
Customers=data$Customers,
Open=data$Open,
Promo=data$Promo,
SchoolHoliday=data$SchoolHoliday)
View(head(covariates))
# Remove intercept
covariates <- covariates[,-1]
View(covariates)
require(forecast)
ARIMAfit <- auto.arima(saledata, xreg=covariates)  # HERE IS THE ERROR LINE
summary(ARIMAfit)
Also, please tell me how I can forecast the next 48 days. I know how to forecast with a simple auto.arima model using the argument n.ahead, but I don't know how to do it when the argument xreg is used.
Best Answer
The issue is caused by the line
ts(data[1:96,4],start=1, end=96,frequency =7 )
When you specify both start and end together with frequency = 7, R recycles the data to fill the whole span, so the series is stretched to run from week 1 to week 96 instead of covering 96 days. Recall that R measures the start and end times in seasons (weeks in your case). Since you are fitting daily data, specifying only start = 0 or start = 1 should be sufficient. Instead of running View(saledata), print saledata at the console to debug it yourself; you will see that the time series has the wrong length.
When you forecast from an ARIMA model fitted with xreg, you need to create a matrix newxreg for the next 48 days with the same structure as xreg, then pass it to the forecast function (the argument is named xreg in forecast() from the forecast package and newxreg in predict()). A good habit for the xreg and newxreg matrices is to include a Day column that acts as an ordering for the data.
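Both points can be checked with a small sketch. It does not read xdata.csv. Instead it builds a synthetic data frame (sales_df, with made-up numbers) that reuses the Promo and SchoolHoliday column names from the question, and it leaves out the other covariates to stay short. The future covariate values are placeholders too: in practice you would fill them in from your own promotion and holiday calendar. Note that forecast() in the forecast package takes the future regressors through its xreg argument, while predict() calls the same thing newxreg.

```r
library(forecast)  # provides auto.arima() and forecast()

set.seed(1)
n <- 96  # days of history, as in the question
h <- 48  # days to forecast

# Synthetic stand-in for xdata.csv (hypothetical values, not the real data)
sales_df <- data.frame(
  Promo         = as.numeric(rbinom(n, 1, 0.4)),
  SchoolHoliday = as.numeric(rbinom(n, 1, 0.2))
)
sales_df$Sales <- 5000 + 800 * sales_df$Promo + rnorm(n, sd = 200)

# Giving both start and end stretches the series: the 96 values are recycled
bad <- ts(sales_df$Sales, start = 1, end = 96, frequency = 7)
length(bad)       # 666, not 96

# Giving only start keeps one observation per day
saledata <- ts(sales_df$Sales, start = 1, frequency = 7)
length(saledata)  # 96

# Regressor matrix for the fitting period (a numeric matrix, one row per day)
covariates <- cbind(Promo         = sales_df$Promo,
                    SchoolHoliday = sales_df$SchoolHoliday)

fit <- auto.arima(saledata, xreg = covariates)

# Regressors for the next 48 days: same columns, same order.
# The values below are assumptions made up for this sketch.
newxreg <- cbind(Promo         = rep(c(1, 0), length.out = h),
                 SchoolHoliday = rep(0, h))

# The forecast horizon is taken from the number of rows in the new regressors
fc <- forecast(fit, xreg = newxreg)
length(fc$mean)   # 48
```

If a covariate such as Customers is not known in advance for the forecast period, it would have to be forecast or assumed as well before it can go into newxreg.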