Question #1:
The problem is that in the MLE case, both the Python (statsmodels) and R procedures use state-space models to estimate the likelihood. In an SARIMAX class, the state-space grows linearly (or worse) with the number of seasons (because the state-space form incorporates all intermediate lags too - so if you have a lag at 3600, the state-space form also has all the 3599 intermediate lags).
So you now have a couple of issues - first, you're multiplying 3600+ matrices by each other, which is slow. Even worse, state space models need to be initialized and often they are by default initialized using a stationary initialization that requires solving a 3600 linear system. When I tested a 3600 seasonal order, it wasn't even getting past this part.
The R arima function accepts method='CSS' which uses least-squares (conditional MLE instead of full MLE) to solve the problem. Depending on how the arima function works, it could be much better in your case.
In Python, there aren't many good options. The SARIMAX class accepts a conserve_memory
option, but if you do that, you can't forecast. To solve the initialization problem, you can call the initialize_approximate_diffuse
method to avoid the 3600 linear system solving. However, even in these cases, you'll be multiplying 3600 x 3600 matrices together, which will be quite slow. I would like to update the SARIMAX class to work with sparse matrices (which would solve this problem) but that's probably quite a ways in the future. I don't know of any non-commercial program that implements state space models using sparse matrices.
Question #5:
This was a bug in the statsmodels code. It has been fixed in the repository (see https://github.com/ChadFulton/statsmodels/issues/2)
First off, your understanding of period/cycle length is correct.
For hourly web traffic, I would expect the strongest seasonalities for cycles of length 24h (daily seasonality) and 168h (weekly seasonality - typically weekdays and weekends differ strongly). For instance, here is traffic to CV.
You have already found our multiple-seasonalities tag. Its tag wiki contains pointers to literature and commonly used algorithms to model this. My impression is that most such specialized models are implemented in R, rather than Python, but I may be wrong about this.
STL and ARIMA can only deal with a single seasonal period. Depending on your data, you may learn a little something from applying these models separately with the two seasonalities. (E.g., a very simple forecast would be to fit separate ARIMA models for the two seasonalities, then average the forecasts.)
Best Answer
This is unspeakably true. This is the point where I upvoted your question.
The excellent free online book Forecasting: Principles and Practice (2nd ed.) by Athanasopoulos & Hyndman gives a number of very simple methods which are often surprisingly hard to beat:
These and similar methods are also used as benchmarks in academic forecasting research. If your newfangled method can't consistently beat the historical average, it's probably not all that hot.
I am not aware of any Python implementation, but that should not be overly hard.