Generalizing Catalan numbers: number of ways where we cross the diagonal $k$ times.

binomial-coefficientscatalan-numberscombinatoricssequences-and-series

Let's say we have a square grid with n steps each. One starts at the lower left corner, takes $2n$ steps; $n$ of them to the right and $n$ of them to the upwards and ends up at the upper right corner. If we want to count the number of paths that don't cross the main diagonal and stay on a particular side of it, we get the Catalan numbers, $C_n=\frac{2n \choose n}{(n+1)}$. Accounting for both sides, the total paths that don't cross the main diagonal then become $2 C_n$. A natural question to ask is: how many paths cross the main diagonal exactly $k$ times? Let's call this number $R_{k,n}$. I want to find a closed-form expression for $R_{k,n}$. Obviously, $R_{0,n}=2C_n$


My attempt and some thoughts

The question here: Using the Catalan numbers provides a warm-up. Both @joriki and @robjohn calculate the number of paths that have a segment that is positive (possibly empty) followed by a segment that is negative (possibly empty). Let's denote this sequence, $G_n$ as joriki does. They do this by noting that conditional on some cut-off point, we simply get two Catalan sequences. Hence, the number of such paths becomes the convolution of the Catalan numbers with themselves. joriki notes that this sequence will have a generating function that is the square of the generating function of the Catalan numbers. He uses this to determine that it is simply the $n+1$th Catalan number. Another way to go about finding this would have been to use the general formula here: Proof of identity about generalized binomial sequences. with $k=2$. The two yield the same answer. This can be used to get $R_{1,n}$ per the following equation (we divide $R_{1,n}$ by 2 because the sequence only considers paths which were negative first and then positive while $R_{1,n}$ includes sequences that were positive first):

$$G_n=C_{n+1}=2C_n+\frac{R_{1,n}}{2}$$
$$=> R_{1,n}=2C_{n+1}-4C_n$$

Now, can we apply this "convolution trick" to get $R_{k,n}$?

One way is to consider paths that have three sections. They start off with a section (possibly empty) below the main diagonal. Then, they cross it and there is a section (possibly empty) above the main diagonal. Then, they cross it again and there is a third section (possibly empty) that stays below the main diagonal. Unlike before, there are two cut-off points and it seems we have a three-way convolution of the Catalan numbers with themselves. The first thought is that the number of such paths (say $H_n$) will have a generating function that is the cube of that of the Catalan numbers. And if we increase the number of segments further, we get higher and higher powers of the generating function. But this can't be right since as we keep increasing the number of such segments, the number of paths should keep increasing per equation (5.70) here: Proof of identity about generalized binomial sequences.. In reality, we'll reach an upper bound at some point when we simply cover all ${2n \choose n}$ paths. So, what is the error in the "three way convolution leading to a generating function becoming the cube of the Catalan number generating function" argument? One resolution might be that the argument is fine, but increasing the cut-off points starts double and triple counting the paths.

Best Answer

Instead of allowing possibly empty sections, we split a path at points of crossing the diagonal, and ignore possible touch points (where the path goes from above/below and bounces back). This gives $$R_{0,n}=2C_n,\qquad R_{k,n}=\sum_{m=1}^{n-1}C_m R_{k-1,n-m}\qquad(k,n>0)$$ ($R_{0,n}$ counts "Catalan" $n$-paths above/below the diagonal (not strictly); to get an $(k,n)$-path, we take an $(k-1,n-m)$-path and append a "Catalan" $m$-path which extends the last step). Then, in the notation of your question, $$R_{k}(z):=\sum_{n=1}^{\infty}R_{k,n}z^n=2\big(B_2(z)-1\big)^{k+1}=2z^{k+1}B_2(z)^{2k+2}.$$ Using the identity $(5.70)$ from the question, we get $$R_{k-1}(z)=2z^k\sum_{t=0}^{\infty}\binom{2t+2k}{t}\frac{2k}{2t+2k}z^t\underset{n:=t+k}{\quad=\quad}2\sum_{n=k}^{\infty}\binom{2n}{n-k}\frac{k}{n}z^n,$$ that is, $R_{k-1,n}=\frac{2k}{n}\binom{2n}{n-k}$ for $1\leqslant k\leqslant n$.