Catalan Numbers – Double Grading Explained

catalan-numbersco.combinatorics

This is something I found in trying to work on Vince Vatter's excellent question. I have no solution, but a much more precise conjecture.

Recall that a rooted planar tree is a rooted tree where, for every vertex, the children of that vertex come equipped with an order. I will think of the vertices of tree as members of an asexually reproducing species, and therefore use language like "sibling", "cousin", "child", "parent", "generation" etc. When I speak of generations, I am counting from the oldest, so the root is the fist generation; the children of the root are the next generation, and so forth.
For any vertex, I will think of the ordering of its children as the birth order of the children.

Define a vertex $v$ to be "crucial" if $v$ is the youngest member of its generation, all the other members of that generation are childless, but $v$ has children. For example, in the tree
$$\begin{matrix}
& & a & & \\
& & \downarrow & & \\
& & b & & \\
& \swarrow & \downarrow & \searrow \\
c & & d & & e \\
& & \downarrow & & \downarrow \\
& & f & & g \\
& & \downarrow & \searrow & \\
& & h & & i \\
& & & & \downarrow \\
& & & & j \\
\end{matrix}$$

the crucial elements are $a$, $b$, $f$ and $i$. So the root is always crucial.

Let $c(p,q,n)$ be the number of planar trees on $n$ vertices with $p$ crucial elements and where the root has $q$ children. For example, of the $5$ planar trees on $4$ elements, there is one tree each with $(p,q)$ equal to $(1,2)$, $(2,1)$, $(3,1)$, $(2,2)$ and $(1,3)$.

The number of planar trees, $\sum_{p,q} c(p,q,n)$, is the Catalan number $C_{n-1}$ (item $e$ on Stanley's famous list). Also, $\sum_{p} c(p,1,n)$ is $C_{n-2}$ — this counts planar trees where the root only has one child, so just delete the root to get planar trees on $n-1$ vertices.

Planar trees on $n$ vertices are in bijection with sequences $a_1$, $a_2$, \dots, $a_{n-1}$ of integers such that $a_1 =0$ and $0 \leq a_{i+1} \leq a_i+1$ (item $u$ on Stanley's list). The bijection is to consider $j$ to be in generation $a_j$, with the parent of $j$ being the largest $k$ such that $a_k = a_j-1$ and $k \lt j$. Then put a root at the top in generation $-1$.

The lists of integers in Vatter's question are a subset of lists above. Specifically, they are the ones where the root is the only crucial vertex. (Doug Zare's answer was very helpful in making me realize this description.) So our goal is to show that $\sum_q c(1,q,n)$ is Catalan. After a lot of computation, I came to the following conjecture:

The integer $c(p,q,n)$ only depends on $p+q$ and $n$. The array of integers $c(p+q, n)$ is the Catalan triangle.

I'm bad at indexing, so I'll give the values for planar trees on $10$ vertices. In total, there are $4862$ such trees in total. There are $429$ of them with $(p,q) = (1,2)$ and the same number with $(p,q) = (2,1)$. In total, the number of trees with each value of $(p,q)$ is:
$$\left(
\begin{array}{ccccccccc}
0 & 429 & 429 & 297 & 165 & 75 & 27 & 7 & 1 \\
429 & 429 & 297 & 165 & 75 & 27 & 7 & 1 & 0 \\
429 & 297 & 165 & 75 & 27 & 7 & 1 & 0 & 0 \\
297 & 165 & 75 & 27 & 7 & 1 & 0 & 0 & 0 \\
165 & 75 & 27 & 7 & 1 & 0 & 0 & 0 & 0 \\
75 & 27 & 7 & 1 & 0 & 0 & 0 & 0 & 0 \\
27 & 7 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
7 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
\end{array}
\right)$$

Note that the numbers 1, 7, 27, 75, 165, 297, 429, 429 appear as a row of the Catalan triangle. In particular, proving this proves Vatter's conjecture, since the sum $\sum_q c(1,q,n)$ would then be the sum of a row of the Catalan triangle, and hence a Catalan number.

Federico Ardila found a matroid whose Tutte polynomial $T_n(x,y)$ obeyed $T_n(1,1) = C_n$ and $T_n(x,y) = T_n(y,x)$. I have checked numerically that $T_n(x,y)$ appears to be $\sum c(p,q,n) x^p y^q$. Federico gives an explicit generating function for his polynomial (Theorem 3.6). From this formula, one can check that the coefficient of $x^p y^q$ depends only on $p+q$. I will add that this is a very unusual property for a Tutte polynomial to have, and I would be interested to know any general property of a matroid which implies it.

Federico gives a combinatorial interpretation for the coefficient of $x^p y^q$ in $T_n(x,y)$ in terms of Dyck paths. For him, $q$ is the number of times the Dyck path returns to $0$, which is easily related to the number of children of the root. However, his $p$ is the number of up steps before the first down step. I do not see why this should be the number of crucial vertices.

What's going on here?

Best Answer

It appears that the zeta map which I used to answer Vince Vatter's initial question and which I describe in my answer there, see also page 50 of Jim Haglund's book, indeed solves also this problem:

As discussed in a comment above, the zeta map has the property that it sends

  • the number of $0$'s in an "area sequence" $a = (a_1,\ldots,a_n)$ (item $u$ in Stanley's list, see also Property $D$ in my answer to the other question) to the number of initial north (or up) steps, and
  • the number of $i$'s for which all $i$'s appear before all $i+1$'s within $a$ to the number of returns to $0$ (excluding the very last return).

On the other hand, David's bijection between rooted planar trees and Dyck paths sends

  • the number of children of the root to the number of $0$'s in the corresponding area sequence, and
  • the number of crucial vertices in a tree (excluding the root which is always, except for $n=1$, crucial) to the number of $i$'s for which all $i$'s appear before all $i+1$'s within the corresponding area sequence.

Combining his bijection with the zeta map yields therefore a bijection between rooted planar trees and Dyck paths that sends the bistatistic

(number of crucial vertices, number of children of the root)

to the bistatistic

(number of returns to $0$, number of initial north steps).

To extend David's initial example of the $5$ rooted planar trees of size $4$, the $5$ area sequences corresponding to them (in the ordering $$(1,2), (2,1), (3,1), (2,2), (1,3)$$ as above) are $$010, 011, 012, 001, 000.$$ Applying the zeta map yields the area sequences $$011, 001, 000, 010, 012.$$ For those, the (sequence of the) number of returns to $0$ is $1,2,3,2,1$ and the (sequence of the) number of initial north steps is $2,1,1,2,3$. We thus obtain the same bistatistical sequence.

We thus proved that $\sum c(p,q,n)x^py^q$ is indeed equal to the Tutte polynomial $T_n(x,y)$.