Tree: number of children of each node given total number of nodes and number of sublevels

algebra-precalculusdiscrete mathematicsgraph theorytrees

Imagine a full n-tree: every node has a single parent (or is the root and has 0 parents) and exactly n children (or is a leaf node and has 0 children).

We can describe this tree with three variables: A = total number of nodes in the tree. B = number of sublevels in the tree. C = number of children per node.

The fundamental formula to relate these three variables is most intuitively:

$$ A = \sum_{n=0}^{B}{C^n} $$

For example, if there are 4 sublevels of hierarchy (under the top node) and each node has 3 children, then B = 4 and C = 3. The top layer's count is 1, then the second layer's count is 3, then 3^2 = 9, then 3^3 and 3^4. The sum equals 121.

From some Googling, I discovered that this sum can be expressed as:

$$ A = \frac{{C^{B+1}-1}}{{C-1}} $$

If given A and C, you can work it out to solve for B and end up with:

$$ B=log_C(AC-A+1)-1 $$

However, I am looking for a way to solve for C given A and B. Is there a way to do this?

Best Answer

Found the answer. First, some renaming of the variables for clarity:

  • Depth of tree = D (was previously B)
  • Node count = N (was previously A)
  • Children per node = C (same as before)

Next for some trivial edge cases:

$$D=0 \leftrightarrow N=1$$ $$D=1 \leftrightarrow N=C+1$$ $$C=1 \leftrightarrow N=D+1$$

So, we will only consider cases where C >= 2 and D >= 2.

Divide all the nodes between leaf nodes L and non-leaf (parent) nodes P. L + P = N. So,

$$L=C^D$$ and $$P=\sum_{n=0}^{D-1}{C^n}$$

Therefore,

$$C=\sqrt[D]L$$

It is obvious that

$$\sqrt[D]L < \sqrt[D]{L+P}$$

But it is a property of full trees where C > 1 that there are more leaf nodes than parent nodes. Because L > P and D >= 2,

$$\sqrt[D]{L+P} < \sqrt[D]{L+L} \leq \sqrt[D]{L*D} = \sqrt[D]L+1$$

So putting them together,

$$\sqrt[D]L < \sqrt[D]{L+P} < \sqrt[D]L+1$$

And because C is an integer,

$$floor(\sqrt[D]{L+P})=\sqrt[D]L$$

Intuitively, this means that all parent nodes are not enough to change the order of magnitude of the tree. The leaf nodes are sufficient to determine this. After substituting,

$$C=floor(\sqrt[D]N)$$

So that is the answer. It also nicely works for the edge case of C=1, but not for D=0 or D=1.

All credit goes to my cousin for figuring this out.