Solved – Calculating Total Probabilities from a Poisson Random Variable

poisson distribution, probability

First time post and I'm not sure if I'm posting in the right place but here goes:

I enjoy sports and was reading a book over Christmas (Mathletics by Wayne Winston). To cut a long story short, I have started ranking sports teams using least squares regression (based on the number of points scored or conceded) to generate power rankings. I've also done a little reading around the subject, such as "The Probability of Winning a Football Game" (Stern, The American Statistician, Aug 1991).

I've noted that this works out fairly well in predicting results for some leagues (based on margins of error), and it has led me to look at the probabilities of scoring goals. Below is a sheet that I have pulled from one of my Excel tables for a given league. The means come from my least squares fit to previous results and give the expected number of goals based on each team's power rankings (offensive and defensive). I have treated the number of goals each team scores as a Poisson random variable and used the Poisson function in Excel to generate the numbers below:

mean         3.07554771   3.389358782
goals        team 1       team 2       team 1 wins
0            0.04616434   0.03373030
1            0.14198062   0.11432408   0.00478905
2            0.21833409   0.19374267   0.03232532
3            0.22383230   0.21888780   0.07650522
4            0.17210173   0.18547233   0.09649483
5            0.10586142   0.12572645   0.07898926
6            0.05426364   0.07102201   0.04731158
7            0.02384149   0.03438844   0.02248027
8            0.00916570   0.01456934   0.00895759
9            0.00313217   0.00548675   0.00310669
10           0.00096331   0.00185966   0.00096076
CUMULATIVE   1.00         1.00
tie          0.159139717
team 1 wins  0.371920564
team 2 wins  0.468939719

(sorry for the layout – I haven't got the hang of the formatting on here yet when pasting from Excel)
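For anyone who wants to check the sheet outside Excel, here is a minimal R sketch that should reproduce the tie and win probabilities. It assumes independent Poisson scores truncated at 10 goals, as in the table, and that "team 2 wins" is taken as the remainder after the tie and team 1 probabilities, which is what the figures imply. The variable names are illustrative only.

```r
# Means taken from the "mean" row of the table
lambda1 <- 3.07554771
lambda2 <- 3.389358782

goals <- 0:10                      # same truncation as the sheet
p1 <- dpois(goals, lambda1)        # P(team 1 scores k), k = 0..10
p2 <- dpois(goals, lambda2)        # P(team 2 scores k), k = 0..10

# joint[i, j] = P(team 1 scores i - 1 and team 2 scores j - 1),
# which is just the product under the independence assumption
joint <- outer(p1, p2)

p_tie   <- sum(diag(joint))               # equal scores
p_team1 <- sum(joint[lower.tri(joint)])   # row index > column index: team 1 scored more
p_team2 <- 1 - p_tie - p_team1            # remainder, as the sheet appears to do

round(c(tie = p_tie, team1 = p_team1, team2 = p_team2), 6)
```

The `outer()` call builds the full grid of score combinations, so any other event (a given margin, a given total) is a sum over the matching cells of `joint`.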

My query is: could I use this to predict the probability of an over/under (e.g. over/under 5.5 total goals)? I've got this far and had a total brain fade (high school statistics was a long while ago, and the topics we learnt never seemed that interesting or relevant at the time). For what it's worth, I've been able to interpret the means fairly well against totals (64% for this league), and I am just looking at ways of refining this (e.g. putting a probability out there, as opposed to, in this case, "6.45 is over 5.5, so I'd go over"). I find it is always fun to have an opinion, but it's better when you can back it up with statistics.

Best Answer

If you believe the scores are well described by the Poisson distribution and that they are independent, you could quickly simulate the probability of certain spreads. Here's some simple R code that runs 10,000 simulations and finds how often team 2 wins by more than 5.5 points (that is, by 6 or more):

mean( (rpois(n=10000,lambda=3.0755) + 5.5) < rpois(n=10000,lambda=3.3894) )

I get about a 2% chance. I know nothing of this domain (sports stats), so you might want to look closely at that independence assumption. You could go deeper and look at alternative discrete distributions like the negative binomial (assuming heterogeneous player contributions?), or go Bayesian and treat your lambda as a random variable to capture more of your uncertainty.
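The simulation above looks at the margin between the two teams. For the over/under on total goals that the question asks about, the same two assumptions (Poisson scores, independence) allow an exact calculation, because the sum of two independent Poisson variables is itself Poisson with mean equal to the sum of the two means. The following is a sketch under those assumptions only; the variable names are made up for illustration, and the simulation is included purely as a sanity check.

```r
# Rounded means from the question's table
lambda1 <- 3.0755
lambda2 <- 3.3894

# Under independence, total goals ~ Poisson(lambda1 + lambda2)
lambda_total <- lambda1 + lambda2

# "Under 5.5" means 5 or fewer total goals; "over 5.5" means 6 or more
p_under <- ppois(5, lambda_total)
p_over  <- 1 - p_under

# Sanity check by simulation, in the same style as the spread example
set.seed(1)                        # arbitrary seed, only for reproducibility
n_sims <- 100000
total  <- rpois(n_sims, lambda1) + rpois(n_sims, lambda2)
p_over_sim <- mean(total > 5.5)

c(exact = p_over, simulated = p_over_sim)
```

With these means the exact figure for the over comes out at about 0.63. If the independence assumption is doubtful, the simulated version is the easier one to adapt, since the two `rpois()` draws can be swapped for whatever joint model you prefer.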
