[Math] what’ is the number of full subtrees of a full binary tree

algorithmscombinatoricscomputer sciencetrees

I'm looking for the number of full sub-trees of a binary tree; all possible tress of height less than $4$ are:


enter image description here


Now my question is:

What is $N(h)$ the maximum number of full sub-trees of a full binary tree of height $h$?

To understand what I mean, consider $H(t)$ the number of full subtrees of the binary tree $t$, let's take an example :

  • If $t$ is the tree in the figure above with $h=3$ named $(a)$, then $H(t)=4$: sub-trees of type $h=0$ , sub-tree of type $h=1$, sub-tree of type $h=2\, (a)$ and finally one of type $h=3\, (a)$ (or itself), the most important observation is the fact that we count just the types (not the redundancy of the element).

More example :

  • $H(h=2,(c))=3$
  • $H(h=3,(d))=4$
  • $H(h=2,(a))=3$
  • $H(h=3,(e))=4$

Using this we can deduce $N(0)=1$, $N(1)=2$,$N(2)=3$ and $N(3)=4$

Best Answer

Firstly, (c) should be in the $h=2$ row and you are missing a few trees in the $h=3$ row; there should be 21 of them. For the number of full binary trees of a certain height see OEIS A001699.

Now to find $N(h)$. Counting by subtrees is hard and there is hardly any literature on the matter. It's hard because the left and right principal subtrees are not independent. The natural recursion involves taking the union of their subtree-sets. Before you read further, you should check out the related thread. I will relate to a few things from its accepted answer.

A binary tree $\tau$ can be compacted into a unique directed acyclic graph where each node represents a distinct subtree in $\tau$. Now you can represent these DAGs in a canonical form, allowing you to define their "shape". In this sense, the "height" of the DAG corresponds to the height of the tree. For example,

$\qquad \qquad \qquad \qquad \qquad \qquad \qquad \qquad$ enter image description here

is a possible shape that contains 11 nodes and has a height of 6. Thus it represents a tree of height 6 with 11 subtrees. To find $N(h)$, we need to make the DAG as wide as possible. A node in layer $\ell$ of the DAG represents a tree of height $\ell$, so layer $\ell$ can contain at most $T(\ell)$ nodes. Also, by the canonical definition of the DAGs, the width of the layers can only grow {1,2,4,8,...} from the top down. These two bounds are tight (by the algorithm in the other thread), and can be used to compute $N(h)$. For example, the first few values $(h, N(h), \text{shape})$ are

0 1 [1]
1 2 [1, 1]
2 3 [1, 1, 1]
3 5 [1, 1, 2, 1]
4 8 [1, 1, 3, 2, 1]
5 12 [1, 1, 3, 4, 2, 1]
6 20 [1, 1, 3, 8, 4, 2, 1]
7 36 [1, 1, 3, 16, 8, 4, 2, 1]
8 57 [1, 1, 3, 21, 16, 8, 4, 2, 1]
9 89 [1, 1, 3, 21, 32, 16, 8, 4, 2, 1]
10 153 [1, 1, 3, 21, 64, 32, 16, 8, 4, 2, 1]

The first 200 values of $N(h)$ are listed here.