Time Series – Effective Strategies for Forecasting Thousands of Time Series Models

ensemble-learning, forecasting, model-evaluation, python, time-series

I'm currently immersed in a challenging forecasting project centred around predicting the required work hours to complete various tasks within a team setting. My dataset comprises crucial attributes, including team IDs, task IDs, hours, and dates. Specifically, I'm working with a comprehensive dataset containing distinct time series information for each unique combination of teams and tasks, resulting in approximately 8000 distinct time series. I aim to construct a robust forecasting model tailored to this intricate scenario.

Amidst this endeavour, I've encountered several complexities. These include varying time series lengths (from 3 months to 2 years of history, since the data mixes new and well-established teams and tasks) and incompleteness or gaps within individual time series.

To provide deeper insight into the dataset's dynamics, each team is associated with a set of tasks, such as "call customer", "draft email", and "follow up with a client". The team members record daily time entries for these tasks, collectively contributing to the intricate web of time series data. The primary objective of my forecasting model is to predict future work hours based on historical observations, facilitating informed planning and decision-making by team leads.

Below are the first ten rows of the dataframe, sorted by Date, TeamID, and TaskID:

*(Dataframe top 10 rows shown as an image in the original post.)*

You can use the code below to generate sample data. Note that the actual data is much noisier and also has gaps in the time series.

import pandas as pd
import random
from datetime import datetime, timedelta

random.seed(42)  # fix the seed so the sample data is reproducible

# Create date range
start_date = datetime(2022, 1, 1)
end_date = datetime(2023, 12, 31)
date_range = pd.date_range(start=start_date, end=end_date, freq='D')
 
# random data for teams and tasks
teamsID = [11, 12, 13]
tasksID = [1, 2, 3]
 
data = []
 
for team in teamsID:
    for task in tasksID:
        task_data = []
        current_date = start_date
        while current_date <= end_date:
            task_data.append({
                'Date': current_date,
                'Team': team,
                'Task': task,
                'Hours': random.randint(1, 8)  # Random hours
            })
            current_date += timedelta(days=1)
        data.extend(task_data)
 
# Create DataFrame
df = pd.DataFrame(data)

In my pursuit of a scalable approach, I'm exploring the following strategies to enhance forecasting accuracy:

  1. Utilizing a diverse ensemble of time series models (e.g., Prophet, ARIMA) to boost forecasting precision, coupled with time series ensemble techniques for prediction aggregation (as detailed here). The resultant ensemble models would be saved for future predictions.

  2. Tuning hyperparameters for individual models to optimize their predictive capabilities.

  3. Conducting rigorous machine learning experiments for each ensemble model to identify the most effective models based on varying hyperparameter configurations.

  4. Evaluating model performance using the Mean Absolute Percentage Error (MAPE) as the chosen evaluation metric. However, tasks with zero hours for certain days pose challenges to accurate MAPE calculation.

  5. I'm considering applying stationarity tests to understand the data more deeply. However, I'm seeking guidance on scaling this approach across all 8,000 time series.

Despite these strategies, scalability remains the core challenge. For instance, training six models (ARIMA, ETS, BATS, TBATS, Prophet, and XGBoost) for each of the 8,000 time series already means 48,000 model fits before any hyperparameter tuning. Given the substantial computation involved, I'm also unsure whether tracking that many experiments with open-source MLOps tools like MLflow is practically feasible.
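One way to keep 8,000 series tractable is to score every series with a cheap benchmark first and only escalate to heavy models where the benchmark fails. The sketch below (function name and threshold logic are illustrative, not from the question) uses a seasonal-naive benchmark scored with MAE, applied per (Team, Task) group via pandas:

```python
import numpy as np
import pandas as pd

def seasonal_naive_score(hours: pd.Series, season: int = 7) -> float:
    """Cheap benchmark: forecast each day with the value `season` days earlier,
    then score with MAE (well defined even when some days have zero hours)."""
    forecast = hours.shift(season)
    errors = (hours - forecast).abs().dropna()
    return float(errors.mean())

# Toy data: two (Team, Task) series of daily hours
rng = np.random.default_rng(0)
df = pd.DataFrame({
    'Date': list(pd.date_range('2023-01-01', periods=60)) * 2,
    'Team': [11] * 60 + [12] * 60,
    'Task': [1] * 60 + [2] * 60,
    'Hours': rng.integers(0, 8, size=120),
})

# One benchmark score per series; heavier models (ARIMA, Prophet, ...) are
# only worth training on the series where the naive benchmark does poorly.
scores = (df.sort_values('Date')
            .groupby(['Team', 'Task'])['Hours']
            .apply(seasonal_naive_score))
print(scores)
```

Because each benchmark fit is essentially free, this pass over all series can also double as the experiment-tracking baseline before any MLflow runs are launched.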

I'm reaching out for valuable insights and guidance. I'm keen to learn about best practices, practical approaches, and any available resources that can help me navigate this challenging endeavour.

Thank you in advance for your expertise and assistance.

Best Answer

First off, do not look to standard time series forecasting algorithms. These presuppose exactly one observation per time bucket, e.g., per day, week, or month (and this observation may be "zero"). What you have, in contrast, is zero or multiple observations per time bucket. In addition, standard forecasting methods expect the time series to be continuous, but you may well have "holes" in the series where some teams do not work on some tasks or task types.
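To make the "zero or multiple observations per bucket" point concrete: if several team members log the same (team, task, day), the raw entries can be collapsed to one observation per bucket, and re-indexing to a regular daily frequency makes the holes explicit. A minimal pandas sketch (column names follow the question's generator; the toy values are made up):

```python
import pandas as pd

# Raw entries: two people logged the same team/task/day, and 2023-01-03 is missing
entries = pd.DataFrame({
    'Date': pd.to_datetime(['2023-01-02', '2023-01-02', '2023-01-04']),
    'Team': [11, 11, 11],
    'Task': [1, 1, 1],
    'Hours': [3, 4, 5],
})

# One observation per (Team, Task, Date) bucket
daily = entries.groupby(['Team', 'Task', 'Date'])['Hours'].sum()

# Re-index one series to a regular daily frequency so gaps become explicit NaNs
series = daily.loc[(11, 1)].asfreq('D')
print(series)
```

Whether a NaN should then become a zero or stay missing is exactly the data-understanding question discussed below, not something the aggregation can decide for you.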

Instead, I would use standard regression models, "regression" being in the Machine Learning sense: predicting a numerical output. Just feed in your predictors and build models as usual.

If you suspect time dynamics, model these. Maybe your teams are less productive on Fridays? Then feed in Boolean dummies for day of week. Perhaps they are less productive during summer? Feed in a Fourier transform of the day of year. Possibly start with a multiple linear regression as a benchmark before trying more complex methods.
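The calendar features described above (day-of-week dummies, Fourier terms for annual seasonality) can be built with a few lines of pandas/NumPy; this is a generic sketch, with column names of my own choosing:

```python
import numpy as np
import pandas as pd

dates = pd.date_range('2022-01-01', '2022-12-31', freq='D')
X = pd.DataFrame({'Date': dates})

# Boolean day-of-week dummies (drop one level to avoid collinearity
# with the intercept in a linear model)
dow = pd.get_dummies(X['Date'].dt.dayofweek, prefix='dow', drop_first=True)

# Low-order Fourier terms capturing smooth annual seasonality
doy = X['Date'].dt.dayofyear
for k in (1, 2):
    X[f'sin_{k}'] = np.sin(2 * np.pi * k * doy / 365.25)
    X[f'cos_{k}'] = np.cos(2 * np.pi * k * doy / 365.25)

X = pd.concat([X, dow], axis=1)
print(X.head())
```

These columns then go into the regression model alongside team/task features; the multiple linear regression benchmark needs nothing more than this design matrix.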

Think about what all those zeros in Hours are: did people really finish a task in zero time, or is that really a missing piece of information, or was the task open and they did not work on it that day? As always, understanding your data is usually much more important than tweaking models. You may want to look at zero-inflated models.
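A quick diagnostic in that spirit is to measure how zero-heavy each series is: series dominated by zeros are the candidates for zero-inflated or two-stage (hurdle) treatment rather than plain regression. A small sketch with made-up toy data:

```python
import pandas as pd

# Toy daily hours for two series; a 0 may mean "task open but untouched"
df = pd.DataFrame({
    'Team': [11] * 5 + [12] * 5,
    'Task': [1] * 5 + [2] * 5,
    'Hours': [0, 0, 3, 0, 4, 5, 6, 0, 7, 8],
})

# Share of zero days per (Team, Task) series
zero_share = (df.assign(is_zero=df['Hours'].eq(0))
                .groupby(['Team', 'Task'])['is_zero']
                .mean())
print(zero_share)
```

If a large zero share turns out to mean "not recorded" rather than "worked zero hours", those days should become missing values instead of zeros before any model sees them.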

In your question, you show TeamID and TaskID. I hope you also have task (and/or team) features so you can actually generalize: TaskID sounds like an identifier used for one task and never again, so from the ID alone you could not forecast anything for a new TaskID. But again, this is standard ML.

Finally, the MAPE has major shortcomings, especially if we have zeros, whose treatment makes quite a difference. Either optimize it directly, or, if you train with a "standard" loss function like MSE or likelihood, post-process predictions to find the point prediction that minimizes the expected MAPE. In fact, I have never seen a business problem that was better solved by a MAPE-optimal forecast than by an MSE-optimal one.
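The zero problem is easy to demonstrate numerically. A common alternative that stays well defined is the weighted absolute percentage error (WAPE), which normalizes the total absolute error by total actuals instead of dividing per observation; a minimal sketch with made-up numbers:

```python
import numpy as np

def mape(actual, forecast):
    """Per-observation percentage error: undefined wherever actual == 0,
    and common workarounds (dropping or epsilon-adding zeros) shift the score."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.mean(np.abs((actual - forecast) / actual)) * 100

def wape(actual, forecast):
    """Weighted absolute percentage error: sum of absolute errors over the
    sum of actuals -- well defined as long as total hours are nonzero."""
    actual, forecast = np.asarray(actual, float), np.asarray(forecast, float)
    return np.abs(actual - forecast).sum() / np.abs(actual).sum() * 100

actual = [0, 4, 8, 0, 6]
forecast = [1, 4, 6, 0, 5]
print(wape(actual, forecast))   # finite
print(mape(actual, forecast))   # nan: the zeros break the per-observation ratio
```

Whichever metric is reported, the choice of training loss (MSE, likelihood) and the choice of evaluation metric should be made deliberately and separately, as argued above.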
