# 7 Game Playoff Series – Statistical Insights


Background: a friend of mine makes a hobby (as I imagine many do) of trying to predict hockey playoff outcomes. He tries to guess the winning team in each matchup, and the number of games needed to win (for anyone unfamiliar with NHL hockey a series is decided by a best of 7). His record this year after 3 rounds of play (8+4+2=14 best of 7 matchups) is 7 correct/7 incorrect for winning team and 4 correct/10 incorrect for number of games (he only considers the number of games correct if he also picked the winning team).

We got to joking that he's doing no better than blind guessing on the teams question, but that he's substantially beating the odds if one assumes that the probabilities for a 4, 5, 6 or 7 game series are equal (one would expect a success rate of $\frac{1}{2}\times\frac{1}{4} = 12.5\%$; he's at $4/14 \approx 28.6\%$).

This got us wondering what the odds actually are for each possible number of games. I think I've worked it out, but I want to tie up a few loose ends since part of my approach was brute-force scribbling on a big piece of paper. My basic assumption is that the outcome of every game is random, with probability $\frac{1}{2}$ for each team to win.

My conclusion is that:

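$$\begin{aligned}
\mathrm{P}(4\;\mathrm{games}) &= \frac{2}{2^4} = 12.5\%\\
\mathrm{P}(5\;\mathrm{games}) &= \frac{8}{2^5} = 25\%\\
\mathrm{P}(6\;\mathrm{games}) &= \frac{20}{2^6} = 31.25\%\\
\mathrm{P}(7\;\mathrm{games}) &= \frac{40}{2^7} = 31.25\%
\end{aligned}$$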

I started from the notion that a 4 game series should have a probability of $\frac{2}{2^4}$, analogous to the odds of flipping 4 coins and getting either 4 heads or 4 tails. The denominators were easy enough to figure out from there. I got the numerators by counting the number of "legal" combinations of results for a given number of games (WWLWWLL would be illegal, for example, since the series would be decided after 5 games and the last 2 games would not be played):

Possible 4 game series (2):
WWWW LLLL

Possible 5 game series (8):
LWWWW WLLLL
WLWWW LWLLL
WWLWW LLWLL
WWWLW LLLWL

Possible 6 game series (20):
LLWWWW WWLLLL
LWLWWW WLWLLL
LWWLWW WLLWLL
LWWWLW WLLLWL
WLLWWW LWWLLL
WLWLWW LWLWLL
WLWWLW LWLLWL
WWLLWW LLWWLL
WWLWLW LLWLWL
WWWLLW LLLWWL

Possible 7 game series (40):

LLLWWWW WWWLLLL
LLWLWWW WWLWLLL
LLWWLWW WWLLWLL
LLWWWLW WWLLLWL
LWLLWWW WLWWLLL
LWLWLWW WLWLWLL
LWLWWLW WLWLLWL
LWWLLWW WLLWWLL
LWWLWLW WLLWLWL
LWWWLLW WLLLWWL
WLLLWWW LWWWLLL
WLLWLWW LWWLWLL
WLLWWLW LWWLLWL
WLWLLWW LWLWWLL
WLWLWLW LWLWLWL
WLWWLLW LWLLWWL
WWLLLWW LLWWWLL
WWLLWLW LLWWLWL
WWLWLLW LLWLWWL
WWWLLLW LLLWWWL
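The hand count above can be cross-checked by machine. The following is a minimal sketch, not a derivation: it brute-forces every W/L string of a given length and keeps only the "legal" ones, i.e. those where one side reaches 4 wins exactly on the last game. The function name `legal_series` and the parameter `wins_needed` are made up for this illustration.

```python
from itertools import product

def legal_series(n_games, wins_needed=4):
    """Return every W/L string of length n_games in which the series
    is decided exactly on the final game (hypothetical helper)."""
    result = []
    for seq in product("WL", repeat=n_games):
        w = seq.count("W")
        l = n_games - w
        # One side must have exactly wins_needed wins, the other fewer.
        if max(w, l) != wins_needed or min(w, l) >= wins_needed:
            continue
        # The series winner must also win the last game; otherwise the
        # series would already have been decided earlier.
        winner = "W" if w == wins_needed else "L"
        if seq[-1] == winner:
            result.append("".join(seq))
    return result

for n in range(4, 8):
    print(n, len(legal_series(n)))
```

Running it prints the counts for 4, 5, 6 and 7 games, which can be compared with the 2, 8, 20 and 40 found by hand.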


What's a non-brute-force method for deriving the numerators? I'm thinking there may be a recursive definition, so that $\rm P(5\;games)$ can be defined in terms of $\rm P(4\;games)$ and so on, and/or that it may involve combinations like $\rm(probability\;of\;at\;least\;4/7\;W)\times(probability\;of\;legal\;combination\;of\;7\;outcomes)$, but I'm a bit stuck. Initially I thought of some ideas involving $\binom{n}{k}$, but it seems that only works if the order of outcomes doesn't matter.

Interestingly, another mutual friend pulled out some statistics on 7 game series played (NHL, NBA, MLB 1905-2013, 1220 series) and came up with:

4 Game Series - 202 times - 16.5%
5 Game Series - 320 times - 26.23%
6 Game Series - 384 times - 31.47%
7 Game Series - 314 times - 25.73%


That's actually a pretty good match (at least from my astronomer's point of view!). I'd guess that the discrepancy comes from the outcome of each game being biased toward a win for one team or the other (indeed, teams are usually seeded in the first round so that the leading qualifying team plays the team that barely qualified, second place plays second last, and so on… and most of the games are in the first round).
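To put numbers on the comparison, here is a small sketch that sets the observed counts quoted above against the fair-coin probabilities from the conclusion earlier in the post. The names `OBSERVED`, `FAIR_COIN` and `compare` are invented for this illustration; nothing is fitted or tested for significance.

```python
# Counts quoted above: 1220 best-of-seven series (NHL, NBA, MLB).
OBSERVED = {4: 202, 5: 320, 6: 384, 7: 314}
# Fair-coin probabilities from the conclusion earlier in the post.
FAIR_COIN = {4: 2 / 2**4, 5: 8 / 2**5, 6: 20 / 2**6, 7: 40 / 2**7}

def compare(observed, predicted):
    """Return rows of (games, observed fraction, predicted fraction,
    predicted count) for each series length (hypothetical helper)."""
    total = sum(observed.values())
    rows = []
    for n in sorted(observed):
        rows.append((n, observed[n] / total, predicted[n], predicted[n] * total))
    return rows

for n, obs_frac, pred_frac, pred_count in compare(OBSERVED, FAIR_COIN):
    print(f"{n} games: observed {obs_frac:.2%} ({OBSERVED[n]}), "
          f"fair coin {pred_frac:.2%} ({pred_count:.2f})")
```

Under the fair-coin assumption, 1220 series would give about 152.5 four-game series (202 observed) and 381.25 seven-game series (314 observed). So short series are over-represented and seven-game series under-represented, which is the direction a per-game bias would push.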

For a team to win the series in game $N$, they must have won exactly 3 of the first $N-1$ games. For game seven, there are $\binom{6}{3} = 20$ ways to do that. Either team can be the one to win game seven, and there are 20 possible orderings of the first six games for each of them, so there are 40 possible outcomes. In general, the number of ways for a best-of-seven series to end in $N$ games is $2 \binom{N-1}{3}$. For $N = 4, 5, 6, 7$ this gives 2, 8, 20 and 40, matching the counts in the question.
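The counting argument translates directly into code. The sketch below evaluates $2\binom{N-1}{3}$ and, as an extension that is not part of the argument above, also lets one team win each game independently with a fixed probability `p`. That fixed-`p` model is an assumption made purely for illustration, and the function names are hypothetical. With `p = 0.5` it reduces to $2\binom{N-1}{3}/2^N$.

```python
from math import comb

def ways_to_end_in(n, wins_needed=4):
    """Number of distinct W/L sequences ending the series in exactly n games."""
    # The winner takes game n and exactly wins_needed - 1 of the first n - 1;
    # the factor 2 accounts for either team being the winner.
    return 2 * comb(n - 1, wins_needed - 1)

def prob_ends_in(n, p=0.5, wins_needed=4):
    """Probability that the series lasts exactly n games, assuming one team
    wins each game independently with probability p (illustrative model)."""
    q = 1.0 - p
    losses = n - wins_needed  # games won by the eventual series loser
    ways = comb(n - 1, wins_needed - 1)
    return ways * (p**wins_needed * q**losses + q**wins_needed * p**losses)

for n in range(4, 8):
    print(n, ways_to_end_in(n), prob_ends_in(n), prob_ends_in(n, p=0.6))
```

The value `p = 0.6` in the loop is an arbitrary example, not a fit to any data. It only shows that a bias toward one team raises the chance of a 4 game series and lowers the chance of a 7 game series relative to the fair-coin case.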