Birthday Calendar Gaps

combinatoricsprobability

I work at a company that posts a birthday calendar. I noticed that there was a string of four consecutive days with no birthdays. What is the probability of that happening?

Problem Statement

Given $n$ people, what is the probability of a observing a birthday calendar with no gaps of length $g$ or greater.

In my case $n = 400$ and $g = 4$. I'm mostly interested in an analytical solution.

Partial Solution

We will count the number of birthday assignments that have gaps less than $g$.

To do this, we will count assignments which have exactly $d$ distinct birthdays ($d = 1, 2, 3, …, 365$) and sum over $d$.

For a given $d$, we will require a counting of two things:

  1. Number of ways to partition $n$ birthdays among $d$ days.
  2. Number of ways to select $d$ days from the year with no gaps of $g$ or greater.

I found a solution to 1: $S(n,d) \times d!$ where $S(n,d)$ is a Stirling Number Of Second Kind. See solution here:

Consecutive birthdays probability

I need help on 2.

Best Answer

For each day, $d$, let $E_d$ be the event that there is a birthday on day $d$, but there are not any birthdays on days $d+1,d+2,\dots,d+g$. A gap of length $g$ occurs if and only if $E_d$ occurs for some $d$. That is, $$ P(\text{no gaps of length $g$})=P(E_1^c\cap E_2^c\cap \dots\cap E_{365}^c) $$ To compute this, we use the principle of inclusion exclusion: $$ P(\text{no gaps of length $g$})=\sum_S(-1)^{|S|}P(E_{d(1)}\cap E_{d(2)}\cap \dots \cap E_{d(k)}) $$ where $S=\{d(1),d(2),\dots,d(k)\}$ ranges over all $2^{365}$ subsets of days.

We must figure out the probabilities of the intersections $E_{d(1)}\cap E_{d(2)}\cap \cdots \cap E_{d(k)}$. If any of the intervals $[d(i),d(i)+g]$ and $[d(j),d(j)+g]$ overlap, then this probability of this intersection is zero; the $E_d$ were defined carefully so this would be true. Otherwise, we use the principle of inclusion exclusion on this smaller problem to compute $$ p_k:= P(E_{d(1)}\cap \cdots\cap E_{d(k)}) = \sum_{j=0}^k(-1)^j\binom{k}j\left(1-\frac{kg+j}{365}\right)^n $$ Finally, we must count for each $k$ the number of ways to choose $\{n(1),\dots,n(k)\}\subseteq \{1,2,\dots,365\}$ so the intervals $[n(i),n(i)+g]$ are pairwise non-overlapping. I claim this number is $$ n_k=\binom{365-gk}{k}+g\binom{365-gk-1}{k-1} $$ I leave it to you to verify this is correct. As a hint, the first summand counts choices where none of the gaps cross between two different years, and the second counts ones that do.

We finally get that $$ P(\text{no gaps of length $g$})=\sum_{k=0}^{\left\lfloor \frac{365}{g+1}\right\rfloor }(-1)^{k}n_kp_k $$