Solved – Combining auto.arima() and ets() from the forecast package

arima, exponential distribution, forecasting, r, time series

I've been using the ets() and auto.arima() functions from the forecast package to forecast a large number of univariate time series. I use the following function to choose between the two methods, but I was wondering whether CrossValidated had any better (or less naive) ideas for automatic forecasting.

auto.ts <- function(x, ic = "aic") {
    XP <- ets(x, ic = ic)
    AR <- auto.arima(x, ic = ic)

    # Return whichever model has the lower information criterion value;
    # both ets and Arima objects store aic/aicc/bic as named components
    if (AR[[ic]] < XP[[ic]]) {
        AR
    } else {
        XP
    }
}
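For example, assuming the function above is defined and the forecast package is loaded, it could be called like this (AirPassengers is a built-in monthly series, used purely for illustration):

```r
library(forecast)

# Select between ets() and auto.arima() by BIC on a built-in monthly series
fit <- auto.ts(AirPassengers, ic = "bic")

class(fit)    # shows whether ets() or auto.arima() won
summary(fit)  # the chosen model's parameters and fit statistics
```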

Edit: What about this function?

auto.ts <- function(x, ic = "aic", holdout = 0) {
    S <- start(x)[1] + (start(x)[2] - 1) / frequency(x)  # Convert start to decimal year
    E <- end(x)[1] + (end(x)[2] - 1) / frequency(x)      # Convert end to decimal year
    holdout <- holdout / frequency(x)                    # Convert holdout in periods to decimal years
    fitperiod <- window(x, S, E - holdout)               # Fit window

    if (holdout == 0) {
        testperiod <- fitperiod
    } else {
        testperiod <- window(x, E - holdout + 1 / frequency(x), E)  # Test window
    }

    XP <- ets(fitperiod, ic = ic)
    AR <- auto.arima(fitperiod, ic = ic)

    if (holdout == 0) {
        AR_acc <- accuracy(AR)
        XP_acc <- accuracy(XP)
    } else {
        AR_acc <- accuracy(forecast(AR, holdout * frequency(x)), testperiod)
        XP_acc <- accuracy(forecast(XP, holdout * frequency(x)), testperiod)
    }

    # accuracy() returns a matrix, with a second ("Test set") row when a
    # holdout is supplied. Index the MAE column by name in the last row,
    # rather than by position, so the right value is compared in both cases.
    if (AR_acc[nrow(AR_acc), "MAE"] < XP_acc[nrow(XP_acc), "MAE"]) {
        AR
    } else {
        XP
    }
}

The "holdout" is the number of periods to reserve as an out-of-sample test set. The function calculates a fit window and a test window from this parameter, runs auto.arima() and ets() on the fit window, and chooses the model with the lower MAE on the test window. If the holdout is equal to 0, it compares in-sample fit instead.
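The window arithmetic can be checked on its own. Here is the fit/test split for a 12-period holdout on a monthly series, using only base R (AirPassengers runs monthly from 1949 through 1960, 144 observations):

```r
x <- AirPassengers
S <- start(x)[1] + (start(x)[2] - 1) / frequency(x)  # 1949.000
E <- end(x)[1] + (end(x)[2] - 1) / frequency(x)      # 1960 + 11/12
h <- 12 / frequency(x)                               # 12-month holdout = 1 decimal year

fitperiod  <- window(x, S, E - h)                    # Jan 1949 through Dec 1959
testperiod <- window(x, E - h + 1 / frequency(x), E) # Jan 1960 through Dec 1960

length(fitperiod)   # 132 observations
length(testperiod)  # 12 observations
```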

Is there a way to automatically refit the chosen model on the complete dataset once it has been selected?
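One way to do this: both Arima() and ets() in the forecast package accept a model argument that re-applies a previously fitted model to new data without repeating the model search. A sketch, where model stands in for whatever auto.ts() returned on the fit window:

```r
library(forecast)

x <- AirPassengers
fitwindow <- window(x, end = c(1959, 12))
model <- auto.arima(fitwindow)  # stand-in for the auto.ts() winner

# Re-apply the selected structure to the complete series. Passing a
# fitted object via `model` reuses the chosen specification rather
# than searching again; see ?Arima and ?ets for what is re-estimated.
if (inherits(model, "ets")) {
    refit <- ets(x, model = model)
} else {
    refit <- Arima(x, model = model)
}
forecast(refit, h = 12)
```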

Best Answer

The likelihoods from the two model classes, and hence the AIC values, are not comparable because of different initialization assumptions, so your function is not valid. I suggest you try both model classes on your series and see which gives the better out-of-sample forecasts.