Solved – How to model football (soccer) scores

poisson distribution, probability

I'm trying to model football scores using the Poisson distribution. I'm calculating a lambda for each team and then multiplying the two Poisson probabilities together to get the probability of a specific score.

e.g. poisson(lambdaHome, 1)*poisson(lambdaAway, 2) to calculate the probability of a 1-2 result at the end of the match.
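
As a minimal sketch of the calculation above, in Python: the function names and the rates 1.5 and 1.1 are made up for illustration, and the two teams' goal counts are assumed to be independent.

```python
from math import exp, factorial


def poisson_pmf(lam: float, k: int) -> float:
    """P(X = k) for X ~ Poisson(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def score_probability(lambda_home: float, lambda_away: float,
                      home_goals: int, away_goals: int) -> float:
    """Probability of an exact final score, assuming the two teams'
    goal counts are independent Poisson variables."""
    return poisson_pmf(lambda_home, home_goals) * poisson_pmf(lambda_away, away_goals)


# Hypothetical rates, chosen only for illustration.
lambda_home, lambda_away = 1.5, 1.1

# Probability of a 1-2 result: poisson(lambdaHome, 1) * poisson(lambdaAway, 2).
p_1_2 = score_probability(lambda_home, lambda_away, 1, 2)
print(f"P(1-2) = {p_1_2:.4f}")
```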

The problem I'm having is that the Poisson distribution assumes a constant scoring rate, so a second goal is just as likely as a third, which is just as likely as a fourth. In reality this shouldn't be true. If a match stands at 2-0, then intuitively the leading team is less likely to score another goal, as it would probably spend more time defending its lead than attacking to score more.

Is there a probability distribution similar to the Poisson that, rather than giving every goal the same probability, makes each successive goal less and less likely?

Best Answer

The state of the art for football prediction in the academic literature is Dixon and Robinson (1998), "A Birth Process Model for Association Football Matches", available without a paywall here: http://www2.imperial.ac.uk/~ejm/M3S4/Problems/football.pdf

Their model of interacting non-homogeneous Poisson processes accounts for two key phenomena that independent Poisson distributions cannot:

  • Rising goal rates. More goals are scored at the end of matches than at the start.
  • Scoreline-dependent goal rates. Teams tend to be motivated or demotivated depending on the scoreline, and these effects differ between the home and away teams.

It is this second point that you are referring to: teams with a comfortable lead see a drop-off in their rate of scoring goals.
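
To get a feel for how those two effects change the picture, here is a rough minute-by-minute simulation in Python. It is only a sketch, not the fitted model from the paper: the parameter names (`time_growth`, `leading_factor`, `trailing_factor`), their values, and the base rates are all assumptions made up for illustration, and for simplicity the same scoreline multipliers are applied to both teams, whereas the paper lets them differ for home and away.

```python
import random


def simulate_match(base_home, base_away, time_growth=0.0,
                   leading_factor=1.0, trailing_factor=1.0,
                   minutes=90, rng=None):
    """Simulate one match in one-minute steps.

    In each minute a team scores at most once, with a small probability
    that depends on the elapsed time and on the current scoreline. This is
    a discrete-time approximation to a non-homogeneous Poisson process.
    """
    if rng is None:
        rng = random.Random()
    home = away = 0
    for minute in range(minutes):
        # Scoring rate rises linearly through the match (if time_growth > 0),
        # which also raises the expected total number of goals.
        time_mult = 1.0 + time_growth * minute / minutes
        # Scoreline effect: scale each team's rate by whether it leads or trails.
        if home > away:
            home_mult, away_mult = leading_factor, trailing_factor
        elif home < away:
            home_mult, away_mult = trailing_factor, leading_factor
        else:
            home_mult = away_mult = 1.0
        p_home = base_home / minutes * time_mult * home_mult
        p_away = base_away / minutes * time_mult * away_mult
        if rng.random() < p_home:
            home += 1
        if rng.random() < p_away:
            away += 1
    return home, away


def mean_home_goals(n_matches, seed=0, **kwargs):
    """Average home goals over many simulated matches (hypothetical rates 1.5 and 1.1)."""
    rng = random.Random(seed)
    total = sum(simulate_match(1.5, 1.1, rng=rng, **kwargs)[0]
                for _ in range(n_matches))
    return total / n_matches
```

With all multipliers left at 1 this reduces to (approximately) the independent Poisson model, so `mean_home_goals(5000)` should come out close to 1.5. Setting, say, `leading_factor=0.5` makes a team that is ahead score at half its usual rate, which is the "less likely to score again" behaviour described in the question.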