Given a binary tree with N labelled leaves, is it possible to find its unique number in the Catalan range

catalan-numberscombinatoricsgraph theoryplanar-graphstrees

The question is about finding the inverse to the problem of generating the $n^{th}$ binary tree with N labelled leaves (Generating the $n^{th}$ full binary tree over $N$ labelled leaves).

Let's say if $N = 4$, the possible set of trees are

1: (((1, 2), 3), 4)
2: (1, ((2, 3), 4))
3: ((1, (2, 3)), 4)
4: (1, (2, (3, 4)))
5: ((1, 2), (3, 4))

If I choose a specific tree from this set, let's say $(1, ((2, 3), 4))$, is there an algorithm that gives me back the value 2? The Catalan range for the problem is 1 to 5 and the unique number corresponding to the given tree is 2.

What do I mean by Catalan Range?

If there are N leaf nodes, the maximum possible binary trees is $C(N-1)$. For a given $n$, its $C(n)$ is the $n^{th}$ Catalan number. We can uniquely identify all the individual binary trees if we assign them a number from $1$ to $C(N-1)$ in order. I'm referring to this range of numbers from $1$ to $C(N-1)$ as the Catalan Range.

What scheme am I using to order the trees from $1$ to $C(N-1)$?

I don't really mind the scheme used to order the trees as long as all the trees can be uniquely identified within that scheme. For example,

$1$: The tree having just one node in the left sub tree and $N-1$ nodes in the right sub tree.
$2$: The tree still having just one node in the left sub tree and with a slightly different right sub tree now.
.
.
$C(N-1)$: The tree having $N-1$ nodes in the left sub tree and one node in the right sub tree.

For the sake of making it easier to discuss the answer, we could follow the scheme as decided by the first answer in this page.

Best Answer

Let $f$ be the function mapping full binary trees to integers; I'll use the convention that binary trees with $n$ leaves will map to the range $\{0, 1, \dots, C_{n-1}-1\}$ because that's easier to work with in the recursion. You can add $1$ later.

If we have a binary tree $T$, let $L$ be the "left" subtree: the subtree with leaves $1, 2, \dots, k$ for some $k$. Let $R$ be the "right" subtree: the subtree with leaves $k+1, k+2, \dots, n$. We will find $f(T)$ in terms of $f(L)$, $f(R)$, and $k$ where for the purposes of finding $f(R)$ we relabel $R$ to have leave $1, 2, \dots, n-k$.

Our trees are labeled in increasing order of $k$. So before this tree, we have $$ C_0 C_{n-2} + C_1 C_{n-3} + \dots + C_{k-2} C_{n-k} $$ trees whose left subtree has $1, 2, \dots, k-1$ leaves respectively.

Next, before this particular left subtree $L$, there are $f(L)$ previous left subtrees on $k$ leaves; for each one of them, there are $C_{n-k-1}$ right subtrees. All $f(L) C_{n-k-1}$ of the combined $n$-leaf trees go before $T$.

Finally, there are $f(R)$ trees with the same left subtree, but with a right subtree preceding $R$; these also go before $T$.

Altogether, we get the recursion $$ f(T) = \sum_{i=1}^{k-1} C_{i-1} C_{n-i-1} + f(L) C_{n-k-1} + f(R). $$ The base of the recursion sets $f(T) = 0$ when $T$ has just one or two leaves, in which case there's only one possible tree. (Actually, we only need the one-leaf case as our base case.)

Related Solutions

[Math] Number of nodes in binary tree given number of leaves

Your formula only works if you assume all the leaves are the same depth in the tree and every node that isn't a leaf has 2 children (see wikipedia for different kinds of binary trees). For example imagine a tree

o
 \
  o

This has $n=1$ leaves and 2 nodes but the formula gives $2n-1=1$.

Making this assumption, to prove by induction, notice (1) that the formula holds true for a tree of height 1 with 1 node, because $2\times 1 - 1 = 1$.

Then (2) assume that the formula holds for trees with $k$ leaves, so assume trees with $k$ leaves have $2k-1$ nodes. Adding another level to the tree with $k$ leaves adds another $2k$ leaves because each leaf in the original tree has 2 children. So this new tree has a total of $2k-1$ leaves from the original plus another $2k$ leaves = $4k-1$ leaves. The formula for $2k$ leaves gives $2(2k)-1=4k-1$ leaves, which is the same!

So because our (1) our base step is true; and (2) our inductive step is true, then the formula is true for all $n$ (subject to the constraint above).

Alternatively, the depth $d$ of the tree is $\log_2 n +1$ because the number of nodes doubles at each level.

The total number of nodes is $1 + 2 + 4 + \ldots$ with $d$ terms in the sum, or $2^0 + 2^1 + 2^2 + \ldots + 2^{d-1}$. By the geometric series formula the sum is equal to

$$ \begin{align}\frac{2^d -1}{2-1} &= 2^{\log_2 (n+1)} - 1 \\&= 2\times2^{\log_2 n} -1\\&= 2n -1\end{align}$$

[Math] In binary tree, number of nodes with two children when number of leaves is given

Hint:

For any tree: $|E| = |V|-1$.
For any graph $2|E| = \sum_{v \in V} \deg(v)$.
A vertex is a leaf if and only if it's degree is $1$.
Except for root, the two-children nodes have degree $3$.
Intuition: start with a path (each vertex has degree 2, except for two leaves at the ends); now, each time you change a vertex from degree 2 to degree 3, you have make some other vertex of degree 2 into degree 1, so that the sum of degrees is constant.

I hope this helps $\ddot\smile$

Best Answer

Related Solutions

[Math] Number of nodes in binary tree given number of leaves

[Math] In binary tree, number of nodes with two children when number of leaves is given

Related Question