Solved – Predicting Next Likely Outcome of Binary Time Series

binary datapredictionprobabilitytime series

I'm trying to approach the following problem: Danny & Johnny are professional basketball players. Each day they meet, and play for a while. Whoever scores the most points is declared winner for the day, and wins a dollar. After a year of games, we are given a vector (including 365 elements) representing, say, Danny's winning days:

[1, 0, 0, 1, 1, 0, 0, …….]

Given the vector of prior 365 'basketball matches' between the two players, what methods can one use to try and predict the winner of the next (366th) game?

Would greatly appreciate any insights / suggestions.

Best Answer

Your question seems to concern the inductive bias that you have to make in order to be able to predict anything meaningful at all. Without assumptions there is nothing you can learn (no free lunch theorem). So what do you assume?

Probably something like this: both players have their own skill level that is constant over the whole year, and whenever they play, the outcome is independent of all previous games. Then you can think of this as a Bernoulli experiment (flipping a coin), biased towards the better player. So, simply check who of the both players won more often, and then always predict him. However, probably you also assume that their skills vary over time? Then you should ignore the first 11 months, and always predict the better of both from the last 30 days - or 3 months? You can also take the approach described by Huanaphu, where the time window length of the dependency on the past is implicitly modelled by the choice of the resolution. However, probably you further suppose that there are even other effects with even less independence between the games? (e.g., whenever Danny looses, he will be extremely motivated the single next time and then win with much higher probability). Now you need a more advanced model that conditions your prediction on the outcome of the last game.

So, first, think carefully about your model assumptions, second, use standard techniques that fit your model in order to predict the "most likely" outcome.

Related Question