How to estimate weekly and daily seasonality for data with 15-minute frequency in Python

Tags: forecasting, multiple-seasonalities, python, statsmodels, time series

I am relatively new to time series. My goal is to predict a few hours of data measured every 15 minutes, based on three months of observations, in Python. I assume the data have daily and weekly cycles, which I want to estimate; I then want to fit an ARMA model to the residuals left after subtracting this seasonal component. I am using statsmodels-0.8.0, the latest version from the master branch.

I tried several approaches, none of which worked. Any input is much appreciated: comments on possible errors, suggestions for further steps, or better approaches. Here is what I tried:

  1. SARIMAX on the original series with seasonal period 96 (the number of 15-minute periods per day) kills my Jupyter notebook kernel.

  2. statsmodels.tsa.seasonal.seasonal_decompose complains about 15 minute frequency, specifically "freq T not understood".

  3. The residuals obtained by subtracting the trends from bkfilter, hpfilter, and cffilter are not stationary.

  4. I tried to model $y_t = \alpha \, y_{t-96} + \beta \, y_{t-96 \cdot 7} + \mathrm{ARMA}(p,q)$ by passing the lagged values to ARIMA as exogenous variables, but got a warning that the likelihood optimization failed to converge, and for some values of p and q the AR or MA coefficients came out as NaN.

  5. periodogram() from statsmodels.tsa.stattools gave me its largest values at periods 62 and 9, but I do not know how to compute the Fourier coefficients and estimate the seasonal component from them.
    I am also concerned that the spectral analysis approach overfits my data: would I get the same periods from a different three months of readings?
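
One way to probe the concern in item 5 is to compute the periodogram on separate stretches of the series and check whether the same periods dominate. The following is only a minimal sketch using NumPy's FFT rather than statsmodels; the function name `dominant_periods` and the synthetic data are illustrative assumptions, not part of the original question. It also makes explicit that a periodogram array is indexed by frequency, so position k corresponds to a period of n/k samples, not k samples.

```python
import numpy as np

def dominant_periods(y, n_peaks=2):
    """Return the periods (in samples) of the n_peaks largest periodogram ordinates."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Periodogram of the demeaned series; index k corresponds to frequency k / n.
    spectrum = np.abs(np.fft.rfft(y - y.mean())) ** 2 / n
    spectrum[0] = 0.0  # ignore the zero-frequency (mean) term
    top = np.argsort(spectrum)[::-1][:n_peaks]
    # Convert frequency indices to periods in samples.
    return sorted(n / k for k in top)

# Synthetic stand-in for the real readings (an assumption for illustration):
# eight weeks at 15-minute resolution with a daily (96) and weekly (672) cycle.
rng = np.random.default_rng(0)
n = 96 * 7 * 8
t = np.arange(n)
y = (10.0
     + 3.0 * np.sin(2 * np.pi * t / 96)
     + 1.5 * np.cos(2 * np.pi * t / 672)
     + rng.normal(0.0, 0.5, n))

# Compare the dominant periods found in two non-overlapping halves.
first_half, second_half = y[: n // 2], y[n // 2:]
print(dominant_periods(first_half))
print(dominant_periods(second_half))
```

If both halves report the same dominant periods, the cycles are less likely to be an artifact of one particular sample. On real data it may help to trim each stretch to a whole number of weeks, since cycles that do not fit the window exactly smear across neighbouring frequencies.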

Best Answer

This is a case of multiple seasonalities. The tag wiki has links to resources. The go-to solution is to model the seasonalities using multiple sinusoidal harmonics, then model the residuals using ARIMA, possibly with a Box-Cox transformation applied to the whole model for good measure. This is known as a TBATS model. The forecast and fable packages for R contain functions for this, but I am not aware of any related functionality in Python. You may need to implement the algorithm per De Livera, Hyndman & Snyder (2011) yourself, or at least let yourself be inspired by the paper.
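
As a rough illustration of the two-stage idea (harmonics first, then a time series model on the residuals), here is a minimal NumPy sketch. It is not TBATS itself: it fits the harmonics by ordinary least squares, substitutes a plain AR(p) fitted by least squares for the ARIMA step, and omits the Box-Cox transformation. All function names, the harmonic orders (3 daily, 2 weekly), and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def harmonics(t, period, order):
    """Sine and cosine columns for the first `order` harmonics of `period`."""
    k = np.arange(1, order + 1)
    angles = 2.0 * np.pi * np.outer(t, k) / period
    return np.hstack([np.sin(angles), np.cos(angles)])

def seasonal_design(t):
    # Intercept + daily (96 samples) + weekly (672 samples) harmonics.
    # The weekly order stays below 7, because the 7th weekly harmonic
    # coincides with the daily fundamental and would be collinear.
    return np.hstack([np.ones((len(t), 1)),
                      harmonics(t, 96, 3),
                      harmonics(t, 672, 2)])

def fit_ar(resid, p):
    """Least-squares fit of resid[t] on resid[t-1], ..., resid[t-p]."""
    n = len(resid)
    lagged = np.column_stack([resid[p - i - 1: n - i - 1] for i in range(p)])
    phi, *_ = np.linalg.lstsq(lagged, resid[p:], rcond=None)
    return phi

def forecast_ar(resid, phi, steps):
    """Iterate the fitted AR recursion forward from the last observations."""
    history = list(resid[-len(phi):])
    out = []
    for _ in range(steps):
        nxt = sum(c * history[-i - 1] for i, c in enumerate(phi))
        out.append(nxt)
        history.append(nxt)
    return np.array(out)

# Synthetic stand-in: two seasonal cycles plus AR(1) noise (illustrative only).
rng = np.random.default_rng(1)
n = 96 * 7 * 8
t = np.arange(n)
shocks = rng.normal(0.0, 0.3, n)
noise = np.zeros(n)
for i in range(1, n):
    noise[i] = 0.7 * noise[i - 1] + shocks[i]
y = 20.0 + 4.0 * np.sin(2 * np.pi * t / 96) + 2.0 * np.sin(2 * np.pi * t / 672) + noise

# Stage 1: estimate the seasonal component by regression on the harmonics.
X = seasonal_design(t)
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Stage 2: model the residuals, then forecast 16 steps (4 hours) ahead.
phi = fit_ar(resid, p=2)
h = 16
t_future = np.arange(n, n + h)
forecast = seasonal_design(t_future) @ beta + forecast_ar(resid, phi, h)
```

Because the seasonal part is a deterministic function of time, extending it into the future only requires evaluating the same harmonics at the new time indices; the residual model then supplies the short-term correction.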