Solved – Forecasting Amazon or Netflix demand

forecasting, time series

Suppose I want to predict Amazon or Netflix demand, using demand data over the past year. For example, I might want to forecast the number of sales in the Electronics category on Amazon, or the number of times someone wants to rent Titanic on Netflix. My dataset consists of daily demand per item over the past couple of months, along with item metadata (tags and categories), split by things like customer demographics (age group, gender, location, browser, job — some of these might be unknown).

To be concrete, let's suppose I want to forecast the number of times someone wants to rent a Comedy on Netflix, and I want to make this forecast at various levels (e.g., overall, by the state the customer lives in, by male/female, etc.). How would I go about this?

My naive first thought is to form a time series at each level I care about (e.g., form a time series of comedy demand by all the males living in Florida), and build some kind of time series model on top of this (I guess an ARIMA model…?). But this seems wrong for a bunch of reasons (not only would I be building a ton of different models for all the different possible levels, but each level would be ignoring a lot of data from closely related levels).

Any suggestions? Surprisingly, I couldn't find any papers related to this problem when Googling, but I might just be using the wrong search terms. (I learned a smidgen of time series analysis a couple years ago, but I was incredibly bad at it.) Also, I'm interested in both methods (what algorithms to use) and particular statistical libraries that might be useful (e.g., R packages or Python libraries).

Best Answer

If you do a good enough job modeling the important predictor variables, you probably will not need to worry as much about the time series aspects (though you should probably still test for serial correlation and adjust for it if needed).
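As a rough illustration of the serial-correlation check, here is a minimal sketch that computes the Durbin-Watson statistic and the lag-1 autocorrelation of a model's residuals. The function names and the simulated residual series are my own assumptions for demonstration, not anything from your data; in practice you would pass in the residuals of whatever regression you fit.

```python
import numpy as np


def durbin_watson(residuals):
    """Durbin-Watson statistic of a residual series.

    Values near 2 suggest little lag-1 autocorrelation; values well
    below 2 suggest positive autocorrelation.
    """
    e = np.asarray(residuals, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))


def lag1_autocorrelation(residuals):
    """Sample lag-1 autocorrelation of a residual series."""
    e = np.asarray(residuals, dtype=float)
    e = e - e.mean()
    return float(np.sum(e[1:] * e[:-1]) / np.sum(e ** 2))


# Simulated residuals (illustrative only): white noise vs. an AR(1) process.
rng = np.random.default_rng(0)
white = rng.normal(size=500)

ar = np.zeros(500)
for t in range(1, 500):
    # Each residual carries over 70% of the previous one.
    ar[t] = 0.7 * ar[t - 1] + white[t]

dw_white = durbin_watson(white)
dw_ar = durbin_watson(ar)
rho_ar = lag1_autocorrelation(ar)

print(f"white noise: DW = {dw_white:.2f}")
print(f"AR(1):       DW = {dw_ar:.2f}, lag-1 autocorrelation = {rho_ar:.2f}")
```

If the statistic on your real residuals comes out far from 2, one common adjustment is to add lagged demand as a predictor or to fit a regression with autocorrelated errors, but which one is appropriate depends on your data.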

Most of the time-series-style association you will see can easily be modeled by things like day of the week, holiday/vacation indicators, and time since the DVD release or since some form of advertising or event that spurs rentals of a particular movie.
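To make that concrete, here is a minimal sketch of the regression approach: build day-of-week dummies, a holiday indicator, and a time-since-release term, then fit ordinary least squares on log demand. Everything here is assumed for illustration: the `design_matrix` helper, the simulated data, the holiday dates, and the choice of OLS on `log1p(demand)` (a Poisson or negative binomial regression might suit count data better).

```python
import numpy as np


def design_matrix(day_index, is_holiday, days_since_release):
    """Build regression features from calendar-style predictors.

    Columns: intercept, six day-of-week dummies (day 0 is the baseline),
    holiday indicator, log(1 + days since release).
    """
    day_index = np.asarray(day_index)
    dow = day_index % 7
    dow_dummies = (dow[:, None] == np.arange(1, 7)[None, :]).astype(float)
    return np.column_stack([
        np.ones(len(day_index)),
        dow_dummies,
        np.asarray(is_holiday, dtype=float),
        np.log1p(np.asarray(days_since_release, dtype=float)),
    ])


# Simulated daily rentals (illustrative only, not real Netflix data).
rng = np.random.default_rng(42)
n_days = 120
day = np.arange(n_days)
holiday = np.isin(day, [30, 31, 90])   # hypothetical holiday dates
since_release = day + 10               # title released 10 days before day 0
weekend = np.isin(day % 7, [5, 6])

true_log_mean = (5.0 + 0.4 * weekend + 0.6 * holiday
                 - 0.3 * np.log1p(since_release))
demand = rng.poisson(np.exp(true_log_mean))

# Fit OLS on log(1 + demand) and keep residuals for diagnostics.
X = design_matrix(day, holiday, since_release)
y = np.log1p(demand)
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
residuals = y - X @ coef

names = (["intercept"] + [f"dow_{d}" for d in range(1, 7)]
         + ["holiday", "log_days_since_release"])
for name, value in zip(names, coef):
    print(f"{name:>24s}: {value: .3f}")
```

The same design matrix can be extended with indicator columns for state, gender, genre, and so on, which lets one pooled model serve several of the levels you care about instead of a separate series per level. The residuals are what you would then check for leftover serial correlation.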