Solved – Forecasting A Noisy Time Series

I have daily time series data of the number of views that a YouTube channel has, alongside daily data on the views that the videos in that channel receive.
I would like to predict three months of views given this data and am not sure how to go about this.

I was thinking of the following:

(1) Time series regression after removing the trend, testing for the optimal number of lags. Our forecasts do not need to be incredibly precise, but they should give a realistic view of where a channel's views might be in 6 months.

(2) As above, but using both the channel's views and the daily views of certain videos. The problem with this approach is that some videos may be only two weeks old and contributing a lot of views, while others are really old and contribute little, having spent their potential views in earlier years. We would then be working with time series of uneven length, so I am stuck here.

Does anyone have any thoughts or ideas?

Also, please assume that I know only regression models, but am open to learning anything that will help with the problem above.

Best Answer

One possibility would be to forecast each video's views using exponential smoothing. Potentially include seasonality if there is evidence of intra-weekly seasonality. Also include a trend.
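One quick way to look for evidence of intra-weekly seasonality is to compare the sample autocorrelation at lag 7 with the autocorrelation at other lags. The sketch below is only an illustration: the function name `autocorrelation` and the view counts are made up, and on real data you would probably want to remove the trend before computing it.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of `series` at a positive lag."""
    n = len(series)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    denom = sum((x - mean) ** 2 for x in series)
    if denom == 0:
        return 0.0  # constant series: no variation to correlate
    num = sum(
        (series[t] - mean) * (series[t - lag] - mean)
        for t in range(lag, n)
    )
    return num / denom


# Hypothetical daily views with a weekend bump (made-up numbers).
week = [100, 100, 100, 100, 100, 150, 160]
views = week * 8

weekly = autocorrelation(views, 7)    # same weekday, one week apart
midweek = autocorrelation(views, 3)   # unrelated weekdays
print(weekly, midweek)
```

A lag-7 autocorrelation that is clearly larger than the neighbouring lags suggests that a weekly seasonal component is worth including.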

Consider modeling and forecasting cumulative views. These will probably increase very quickly at the beginning, and then the increase will taper off. So it would make sense to include a damped trend.
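To make the damped-trend idea concrete, here is a minimal hand-rolled sketch of exponential smoothing with an additive damped trend, applied to cumulative views. The smoothing parameters `alpha`, `beta` and `phi` are fixed at arbitrary values purely for illustration (in practice they would be estimated from the data), the initialisation is deliberately crude, and the numbers are invented.

```python
def damped_holt_forecast(y, horizon, alpha=0.8, beta=0.2, phi=0.9):
    """Forecast `horizon` steps ahead with additive damped-trend smoothing.

    alpha: level smoothing, beta: trend smoothing, phi: damping (0 < phi < 1).
    All three are illustrative defaults, not fitted values.
    """
    if len(y) < 2:
        raise ValueError("need at least two observations")
    level = y[0]
    trend = y[1] - y[0]  # crude initial trend from the first two points
    for obs in y[1:]:
        prev_level = level
        level = alpha * obs + (1 - alpha) * (prev_level + phi * trend)
        trend = beta * (level - prev_level) + (1 - beta) * phi * trend
    forecasts = []
    damp_sum = 0.0
    for h in range(1, horizon + 1):
        damp_sum += phi ** h  # phi + phi^2 + ... + phi^h
        forecasts.append(level + damp_sum * trend)
    return forecasts


# Hypothetical cumulative views of one video over its first 8 days.
cumulative = [100, 180, 240, 285, 320, 345, 365, 380]
forecast = damped_holt_forecast(cumulative, horizon=5)
print(forecast)
```

Because the trend is multiplied by a power of `phi` at every step, the forecast keeps rising but by a smaller amount each day, which matches the tapering growth described above.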

Alternatively, you could look at regression with ARIMA errors, where you could include "bump functions" to model a video's "lifecycle". Or Bass diffusion models, which explicitly model market growth and saturation.
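For the Bass model, the cumulative curve has a closed form, so it is easy to see what shape it implies. The sketch below treats a video's cumulative views as `m` times the Bass adoption fraction, which is an approximation on my part (views are not one-off adoptions). The parameter values for `m` (eventual total views), `p` (innovation) and `q` (imitation) are invented and would have to be fitted to real data.

```python
import math


def bass_cumulative(t, m, p, q):
    """Cumulative Bass curve at time t: m * F(t).

    F(t) = (1 - exp(-(p + q) t)) / (1 + (q / p) exp(-(p + q) t))
    """
    decay = math.exp(-(p + q) * t)
    return m * (1 - decay) / (1 + (q / p) * decay)


# Made-up parameters for a single video, with t measured in days.
m, p, q = 10_000, 0.03, 0.4

curve = [bass_cumulative(day, m, p, q) for day in range(0, 61)]
# Implied daily views are the day-to-day differences of the cumulative curve.
daily = [b - a for a, b in zip(curve, curve[1:])]
print(round(curve[7]), round(curve[30]), round(curve[60]))
```

The curve starts at zero, grows fastest early on and then saturates at `m`, which is the "growth and saturation" behaviour mentioned above.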

In addition, you can probably improve your forecasts by leveraging the hierarchical structure. Forecast each video separately, as above. Also forecast the entire channel's views. Then combine the forecasts using the optimal combination approach (see here). This even allows you to use different forecasting methods for the different videos. I have repeatedly found that this approach improves forecasts on all levels of a hierarchy.
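As a rough illustration of combining the forecasts, here is the simplest unweighted least-squares reconciliation for a two-level hierarchy, assuming the channel total is exactly the sum of the videos being modeled. In that special case the least-squares solution has a closed form: each video forecast is shifted by the gap between the channel forecast and the sum of the video forecasts, divided by the number of videos plus one. The function name `reconcile_ols` and the numbers are hypothetical, and weighted variants of the approach would give different adjustments.

```python
def reconcile_ols(channel_forecast, video_forecasts):
    """Reconcile one channel forecast with its video forecasts.

    Assumes channel = sum of videos. Returns (channel, videos) that add up
    exactly and are the unweighted least-squares combination of the inputs.
    """
    n = len(video_forecasts)
    gap = channel_forecast - sum(video_forecasts)
    adjust = gap / (n + 1)  # every video absorbs an equal share of the gap
    videos = [v + adjust for v in video_forecasts]
    return sum(videos), videos


# Hypothetical forecasts for one future day: they do not add up.
channel_hat = 1000
videos_hat = [500, 300, 100]  # sums to 900

channel_rec, videos_rec = reconcile_ols(channel_hat, videos_hat)
print(channel_rec, videos_rec)
```

After reconciliation the video forecasts sum to the channel forecast, and both levels have moved toward each other rather than one simply overriding the other.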
