How to set up the xreg argument in auto.arima() in R?

arima, time series

I am working on a small project with a single time series of daily customer visits. My covariates are a continuous variable, Day, which counts the days elapsed since the first day of data collection, and some dummy variables, such as whether the day is Christmas and which day of the week it is.

Part of my data looks like:

Date    Customer_Visit  Weekday Christmas       Day
11/28/11        2535       2        0            1   
11/29/11        3292       3        0            2   
11/30/11        4103       4        0            3   
12/1/11         4541       5        0            4   
12/2/11         6342       6        0            5  
12/3/11         7205       7        0            6   
12/4/11         3872       1        0            7   
12/5/11         3270       2        0            8   
12/6/11         3681       3        0            9   

My plan is to fit an ARIMAX model to the data. This can be done in R with the function auto.arima(). I understand that I have to put my covariates into the xreg argument, but my code for this part always returns an error.

Here is my code:

xreg     <- c(as.factor(modelfitsample$Christmas), as.factor(modelfitsample$Weekday), 
              modelfitsample$Day)
modArima <- auto.arima(ts(modelfitsample$Customer_Visit, freq=7), allowdrift=FALSE, 
                       xreg=xreg)

The error message returned by R is:

Error in model.frame.default(formula = x ~ xreg, drop.unused.levels = TRUE) : 
  variable lengths differ (found for 'xreg')

I learned a lot from "How to fit an ARIMAX-model with R?", but I am still not clear on how to set up the covariates or dummy variables in the xreg argument of auto.arima().

Best Answer

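The main problem is that your xreg is not a matrix. c() concatenates the three variables end to end into one long vector, three times the length of the series, which is why R reports that the variable lengths differ. xreg needs to be a numeric matrix with one row per observation and one column per predictor, with the factors expanded into dummy columns. I think the following code does what you want. I've used some artificial data to check that it works.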

library(forecast)
# Create some artificial data
modelfitsample <- data.frame(Customer_Visit=rpois(49,3000),Weekday=rep(1:7,7),
                             Christmas=c(rep(0,40),1,rep(0,8)),Day=1:49)

# Create matrix of numeric predictors
xreg <- cbind(Weekday=model.matrix(~as.factor(modelfitsample$Weekday)),
              Day=modelfitsample$Day,
              Christmas=modelfitsample$Christmas)

# Remove the intercept column added by model.matrix()
xreg <- xreg[,-1]

# Rename columns (Weekday 1 is the baseline level, so six dummies remain)
colnames(xreg) <- c("Mon","Tue","Wed","Thu","Fri","Sat","Day","Christmas")

# Variable to be modelled
visits <- ts(modelfitsample$Customer_Visit, frequency=7)

# Find ARIMAX model
modArima <- auto.arima(visits, xreg=xreg)
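
To see the length mismatch directly, and as a more compact way to build the same kind of matrix, here is a minimal sketch in base R. It uses a made-up data frame d with the same columns as modelfitsample. The names d, bad_xreg and good_xreg are illustrative only, and the column labels assume that Weekday 1 is Sunday, so that levels 2 to 7 map to Mon through Sat.

```r
# Artificial data with the same columns as modelfitsample (values are made up)
set.seed(1)
d <- data.frame(Customer_Visit = rpois(49, 3000),
                Weekday        = rep(1:7, 7),
                Christmas      = c(rep(0, 40), 1, rep(0, 8)),
                Day            = 1:49)

# c() joins the three variables end to end into one long vector,
# so its length is three times the number of observations
bad_xreg <- c(as.factor(d$Christmas), as.factor(d$Weekday), d$Day)
length(bad_xreg)      # compare with nrow(d)
is.matrix(bad_xreg)

# A single model.matrix() call expands Weekday into dummy columns and
# keeps Day and Christmas as numeric columns; [, -1] drops the intercept
good_xreg <- model.matrix(~ factor(Weekday) + Day + Christmas, data = d)[, -1]

# Assumed labels: Weekday 1 (baseline) is Sunday, so levels 2..7 are Mon..Sat
colnames(good_xreg) <- c("Mon", "Tue", "Wed", "Thu", "Fri", "Sat",
                         "Day", "Christmas")

dim(good_xreg)        # one row per observation, one column per regressor
```

A matrix built this way should be usable in the same way as xreg above, for example auto.arima(visits, xreg=good_xreg), provided it has one row for each observation in the series.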