Fruit Tree Paradox: Sum of disjoint probabilities not equalling $1$

geometric seriesparadoxesprobabilityrecreational-mathematics

This is a probability question that I came up with, and have noticed some things that do not seem to be right. Here's a description of the question:

Suppose we have a fruit tree growing in our garden, which grows from scratch in steps (stages). At each stage, there are three equally likely components of the tree that could appear: a single branch, a double branch, or a fruit. A branch's growth is independent, and any one branch will stop growing only if a fruit appears in its last stage of growth. A double branch means that one branch turns into two branches. A single branch is sort of redundant, as in it only delays the growth of a fruit, which is not of our concern.

Denote the probability that we end up with exactly $n$ fruits by $f(n)$. What is $f(n)$?

I did derive a formula for $f(n)$, but I feel it is wrong. I first noticed that $f(0)=0$ as the tree won't stop growing until there is at least one fruit. Some initial values of $f(n)$: $$f(1)= \sum_{i=0}^{\infty} P(i \ \text{single branches and one fruit})=\frac 13+\frac 13\cdot\frac 13+\frac 13\cdot\frac 13\cdot\frac 13 \cdots =\frac 12 $$
By symmetry, the probability of occurrence of a double branch after however many number of single branches will also be $\frac 12$.

$$f(2)= P(\text{double branch})\cdot [P(\text{fruit on one of the branches of the double branch})]^2\\=\frac 12\cdot \left(\frac 12\right)^2 =\frac 18$$

For the general case, the following recursion is obtained: $$f(n)= P(\text{double branch})\cdot P(\text{fruit})\cdot f(n-1) \\ \implies f(n)=\frac{1}{2^{2n-1}}$$

The problem is, when I sum $f(n)$ I'm not getting the expected result: $1$. $$\sum_{n=1}^{\infty}f(n)= 2\sum_{n=1}^{\infty}4^{-n} =\frac 23\overset{?}{\ne} 1$$
What is happening here? Any pointers to where I'm wrong will be highly appreciated. Thanks.

enter image description here

Best Answer

Your recursion is wrong because you’re only accounting for the possibility that one particular branch bears $1$ fruit and the other bears $n-1$. This happens to yield the right result for $n=2$ because in that case $(1,1)$ is the only possible combination of fruit counts. But $n=3$ can be realized as $(1,2)$ and $(2,1)$, so the probability is twice what you calculate, and beginning with $n=4$ both branches might bear more than one fruit. The correct recursion is

$$ f(n)=P(\text{double branch})\sum_{k=1}^{n-1}f(k)f(n-k)\;. $$

Multiplying this by $P(\text{double branch})$ and defining $g(n)=P(\text{double branch})f(n)$ yields

$$ g(n)=\sum_{k=1}^{n-1}g(k)g(n-k)\;, $$

with $g(1)=\frac14$. Here are the first few values:

\begin{array}{c|c|c} n&g(n)&f(n)\\\hline 1&\frac14&\frac12\\ 2&\frac1{16}&\frac18\\ 3&\frac1{32}&\frac1{16}\\ 4&\frac5{256}&\frac5{128} \end{array}

Contrary to what has been stated in the comments, the probabilities should sum to $1$, since the average number of descendants of each branch is $1$ and thus the extinction probability for the branches is $1$ (see e.g. Wikipedia). The statement in the comment that the expected number of fruits is infinite is correct; this does not imply that the probability to obtain infinitely many fruits is non-zero. It would, however, be non-zero if you very slightly tweak the probabilities such that each branch generates more than one branch on average.

Edit:

Actually, the problem can be mapped to a simple symmetric random walk in one dimension. As you noted, you can ignore the single branches, since they’re just a delay. So the probability to eventually get a fruit or a double branch is $\frac12$ each. It doesn’t matter in which order we process the branches; all that matters is the number of active branches. So with probability $\frac12$ you decrement the number of active branches, and with probability $\frac12$ you increment it; so the number of active branches is a simple symmetric random walk starting at $x=1$. Each decrementing step produces a fruit, so the number of ways to produce $n$ fruit in $2n-1$ steps from $1$ to $0$ without hitting $0$ in between is the $(n-1)$-th Catalan number $C_{n-1}$, and the corresponding probability is $2^{-(2n-1)}C_{n-1}$, in agreement with the above table and recurrence relation.

Related Question