Solved – How to correct outliers once detected for time series data forecasting

forecasting, outliers, time series, winsorizing

I'm trying to find a way to correct outliers once I have detected them in time series data. Some methods, like nnetar in R, give errors for time series with large outliers. I have already handled the missing values, but the outliers are still damaging my forecasts…

Best Answer

There is now a facility in the forecast package for R for identifying and replacing outliers. (It also handles missing values.) As you are apparently already using the forecast package, this might be a convenient solution for you. For example:

library(forecast)  # provides nnetar() and tsclean(); x is your time series
fit <- nnetar(tsclean(x))

The tsclean() function fits a robust trend using loess (for non-seasonal series), or robust trend and seasonal components using STL (for seasonal series). The residuals from that fit are then used to compute the following bounds:

\begin{align} U &= q_{0.9} + 2(q_{0.9}-q_{0.1}) \\ L &= q_{0.1} - 2(q_{0.9}-q_{0.1}) \end{align} where $q_{0.1}$ and $q_{0.9}$ are the 10th and 90th percentiles of the residuals respectively.

Outliers are identified as points with residuals larger than $U$ or smaller than $L$.
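The bounds above can be sketched in a few lines of base R. This is only an illustration of the rule as stated here, not the forecast package's own code; the helper name flag_outliers and the toy residuals are made up for the example.

```r
# Hypothetical helper: flag residuals that fall outside [L, U] as defined above.
flag_outliers <- function(resid) {
  # 10th and 90th percentiles of the residuals
  q <- quantile(resid, probs = c(0.1, 0.9), na.rm = TRUE, names = FALSE)
  spread <- q[2] - q[1]
  upper <- q[2] + 2 * spread   # U
  lower <- q[1] - 2 * spread   # L
  list(lower = lower, upper = upper,
       is_outlier = !is.na(resid) & (resid > upper | resid < lower))
}

# Toy residuals (made-up values) with one obvious spike at position 6.
resid <- c(-1, 0.5, 0, 1, -0.5, 25, 0.2, -0.8, 0.9, -0.3)
res <- flag_outliers(resid)
which(res$is_outlier)
```

Because the bounds are built from the 10th and 90th percentiles rather than from the mean and standard deviation, a handful of extreme residuals has little influence on where the bounds end up.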

For non-seasonal time series, outliers are replaced by linear interpolation. For seasonal time series, the seasonal component from the STL fit is removed and the seasonally adjusted series is linearly interpolated to replace the outliers, before re-seasonalizing the result.
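To make the whole procedure concrete, here is a rough base-R sketch of the steps described above. It is not the implementation inside tsclean(): the function name clean_series, the loess and stl settings, and the simulated data are all assumptions chosen for illustration, and it does not handle missing values.

```r
# Hypothetical re-creation of the described procedure (not forecast's code).
clean_series <- function(x) {
  stopifnot(is.ts(x), !anyNA(x))
  t <- seq_along(x)
  y <- as.numeric(x)
  period <- frequency(x)

  if (period > 1 && length(x) > 2 * period) {
    # Seasonal case: robust STL decomposition into trend, seasonal, remainder.
    parts <- stl(x, s.window = "periodic", robust = TRUE)$time.series
    seasonal <- as.numeric(parts[, "seasonal"])
    resid <- as.numeric(parts[, "remainder"])
  } else {
    # Non-seasonal case: robust loess trend (family = "symmetric").
    fit <- loess(y ~ t, data = data.frame(y = y, t = t), family = "symmetric")
    seasonal <- rep(0, length(y))
    resid <- y - predict(fit)
  }

  # Bounds from the 10th and 90th percentiles of the residuals.
  q <- quantile(resid, c(0.1, 0.9), names = FALSE)
  upper <- q[2] + 2 * (q[2] - q[1])
  lower <- q[1] - 2 * (q[2] - q[1])
  bad <- resid > upper | resid < lower

  # Interpolate the seasonally adjusted series, then add the seasonality back.
  adjusted <- y - seasonal
  if (any(bad)) {
    adjusted[bad] <- approx(t[!bad], adjusted[!bad], xout = t[bad], rule = 2)$y
  }
  ts(adjusted + seasonal, start = start(x), frequency = period)
}

# Simulated monthly series with one injected outlier at position 40.
set.seed(1)
n <- 72
truth <- 0.2 * seq_len(n) + 5 * sin(2 * pi * seq_len(n) / 12)
x <- ts(truth + rnorm(n, sd = 0.3), frequency = 12)
x[40] <- x[40] + 50
cleaned <- clean_series(x)
```

In practice tsclean() is the better choice, since it also deals with missing values and is maintained alongside the rest of the package; the sketch is only meant to show how the pieces fit together.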