Solved – Why choose a non-seasonal model and manually perform seasonal adjustment instead of using a seasonal model

Tags: forecasting, seasonality, time-series

Given that seasonal models such as Holt-Winters and SARIMA estimate seasonal components of a time series, when would it be necessary or advantageous to manually apply seasonal decomposition and then fit a non-seasonal model?

Another way to ask this question: why, or when, would one choose a non-seasonal model with seasonal adjustment (e.g. manually calculating seasonal indices by the ratio-to-moving-average method and applying simple exponential smoothing) over a seasonal model when the data are seasonal?
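For concreteness, this is the sort of manual workflow I have in mind, as a minimal sketch (a multiplicative ratio-to-moving-average adjustment of the AirPassengers data, then simple exponential smoothing; the details are illustrative rather than prescriptive):

library(forecast)

y <- AirPassengers

# Ratio-to-moving-average: estimate multiplicative seasonal indices
# from a centred 12-month (2x12) moving average
trend   <- stats::filter(y, c(0.5, rep(1, 11), 0.5) / 12)
ratios  <- y / trend
indices <- tapply(ratios, cycle(y), mean, na.rm = TRUE)  # one index per month
indices <- indices / mean(indices)                       # normalise to average 1

# Seasonally adjust, forecast with simple exponential smoothing,
# then put the seasonality back into the forecasts
y_sa <- y / indices[cycle(y)]
fc   <- ses(y_sa, h = 12)
fc_seasonal <- fc$mean * indices[cycle(fc$mean)]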

Best Answer

I can think of several possible reasons.

Might be the only tool you've got

These days this isn't a particularly credible reason, because even recent versions of Excel have built-in ways to handle seasonality, but I suspect it is still the most common one in practice. If the analyst doesn't have access to software that can fit Holt-Winters or SARIMA models and is basically doing everything by hand, it can be much easier to do a basic seasonal decomposition first and then forecast the adjusted series with a simple method.

Officially seasonally adjusted series might be the variable of interest

For some variables, like unemployment, nearly all the attention and public policy debate focuses on the seasonally adjusted values, with the seasonal adjustment done by a national statistics office. If seasonality is just a nuisance factor and you only ever want to forecast the seasonally adjusted value, it is much simpler to project the officially adjusted series forwards than to forecast the original series and perform your own seasonal adjustment on it, which introduces new sources of divergence between your forecasts and the figures that are eventually published.

This isn't quite your example, I know, because you're asking about why someone would manually seasonally adjust the data themselves, but I'm including it to round out the picture.
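As a quick illustration of how little is then involved, here is a minimal sketch that fits a non-seasonal model straight to an adjusted series. I use a seasonally adjusted version of AirPassengers as a stand-in; in the scenario above you would load the statistics office's published adjusted figures instead:

library(forecast)
library(seasonal)

# Stand-in for an officially adjusted series; in practice you would
# read in the published seasonally adjusted figures directly
sa_series <- final(seas(AirPassengers))

# Fit a non-seasonal exponential smoothing model to the adjusted series;
# the trailing "N" in the model code forbids a seasonal component
fit <- ets(sa_series, model = "ZZN")
forecast(fit, h = 12)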

Seasonality might exacerbate problems with variance

If you want to use a method that requires second-order stationarity and you have a time series whose variance increases with the mean, strictly speaking you need to do something about it. Taking logarithms or, more generally, a Box-Cox transformation is how this is usually done, but what if there are other reasons to stick with the untransformed version (e.g. a theoretical relationship between an explanatory variable and the target variable)?

Seasonal adjustment might be a way out of this, or at least a way to reduce the problem. Consider this micro-example (in R) with the famous AirPassengers data. The original data are usually log-transformed ($\lambda = 0$ in a Box-Cox transform), and in fact might need something even more aggressive than that ($\lambda < 0$). But once the series has been seasonally adjusted, you might get away with a square root transform (more or less $\lambda = 0.5$):

library(forecast)
library(seasonal)
library(ggplot2)
library(gridExtra)

p1 <- autoplot(AirPassengers) + ggtitle("Air Passengers", "Original data")

# Seasonally adjust with X-13ARIMA-SEATS via the seasonal package
ap_sa <- final(seas(AirPassengers))
p2 <- autoplot(ap_sa) + ggtitle("Air Passengers", "Seasonally adjusted")

# Guerrero's method chooses the lambda that stabilises the variance
BoxCox.lambda(AirPassengers) # -0.29
BoxCox.lambda(ap_sa)         # 0.38

grid.arrange(p1, p2)

[Figure: Air Passengers, original data (top) and seasonally adjusted (bottom)]

Basically, the seasonal adjustment has made much of the variance problem go away, as shown by the higher value of $\lambda$ that the BoxCox.lambda function recommends to achieve a roughly constant coefficient of variation.
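If you want to eyeball the transformed series side by side, the BoxCox function from the forecast package (already loaded above) applies the transform directly. This is just a quick visual check added for illustration, not part of the original analysis:

# w = log(y) when lambda = 0; w = (y^lambda - 1) / lambda otherwise
p5 <- autoplot(BoxCox(AirPassengers, lambda = 0)) +
  ggtitle("Original data", "log transform")
p6 <- autoplot(BoxCox(ap_sa, lambda = 0.5)) +
  ggtitle("Seasonally adjusted", "square root transform")
grid.arrange(p5, p6)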

Fits in with a general exploratory analysis workflow

For many purposes, seasonality is a nuisance that hides the relationships between variables. This is shown in the example below (sorry, not a great example, but it might make the point), with the Chinese import and export data that come with the seasonal package.

library(tidyverse)
library(seasonal)
# exp and imp are monthly Chinese exports and imports data from the seasonal package
china <- data.frame(exports = exp, imports = imp)

p3 <- ggplot(china, aes(x = imports, y = exports)) +
  geom_point() +
  geom_path() +
  scale_x_sqrt() +
  scale_y_sqrt() +
  coord_equal()

p4 <- china %>%
  # seasonally adjust each column, then reassemble as a data frame
  map(function(x){
    x <- ts(x, start = c(1983, 7), frequency = 12)
    as.numeric(final(seas(x)))
  }) %>%
  as_tibble() %>%
  ggplot(aes(x = imports, y = exports)) +
  geom_point() +
  geom_path() +
  scale_x_sqrt() +
  scale_y_sqrt() +
  coord_equal()

grid.arrange(p3, p4, ncol = 2)

[Figure: exports versus imports, original data (left) and seasonally adjusted (right)]

The relationship between exports and imports - both growing very fast - is clearer in the seasonally adjusted graphic on the right (with a better example dataset, the difference can be much more striking). If the forecasting is basically an extension of a broader body of analysis and exploration along those lines, it can make sense to use the seasonally adjusted series all the way through.

It seems to work

This question prompted me to do a systematic check of about 3,000 monthly and quarterly data sets from the M3 and Tourism forecasting competitions. The stlm function in Rob Hyndman's forecast package does an STL-based seasonal adjustment first, then uses ARIMA or exponential smoothing state space models to forecast the adjusted series, so it can be compared with applying the auto.arima and ets functions to the original data. Somewhat to my surprise, I found that at least in the case of ARIMA models, the prior seasonal adjustment done by stlm led to better average performance than a seasonal ARIMA model; there was no noticeable difference for the ets models. The results are summarised in this graphic:

[Figure: average forecast accuracy of stlm-based models versus seasonal models fit directly, M3 and Tourism competition data]
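For anyone who wants to reproduce a single-series version of this check, here is a rough sketch of the comparison, assuming the Mcomp package for the M3 data; it is illustrative only, not the code behind the full experiment, and series N1500 is just an arbitrary monthly example.

library(forecast)
library(Mcomp)  # M3 competition data; assumed available

# Compare STL-adjust-then-ARIMA against a direct seasonal ARIMA on one
# series, scoring out-of-sample MASE on the competition's test period
compare_one <- function(series) {
  h          <- length(series$xx)
  fit_stlm   <- stlm(series$x, method = "arima")  # seasonal adjustment first
  fit_sarima <- auto.arima(series$x)              # seasonal ARIMA directly
  c(stlm   = accuracy(forecast(fit_stlm,   h = h), series$xx)["Test set", "MASE"],
    sarima = accuracy(forecast(fit_sarima, h = h), series$xx)["Test set", "MASE"])
}

compare_one(M3[["N1500"]])  # one monthly series as an example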

A full write-up is available on my blog.