Fast method to convert Catalan path to lexicographic position.

binomial-coefficientscatalan-numberscombinatorics

I was looking through the questions: Fast way to get a position of combination (without repetitions) and (Fast way to) Get a combination given its position in (reverse-)lexicographic order for converting a combination to its lexicographic position and then going from the position to the combination. A logical extension of this is the generalized Catalan numbers which are basically combinations with an additional constraint.

Consider a gambler tossing a coin and winning 1\$ on heads and losing 1\$ on tails. We know he got $l$ tails and $l+k$ heads. The number of ways he could have done this is: ${k+2l \choose l}$ and if we wanted to generate the actual paths, we can do so using the approaches in the questions linked above (the binary representations of the combinations would be the tosses at which he got tails).

But now, we add the further constraint that he reaches $k\$$ for the first time after $k+2l$ tosses. The post: Probability that random walk will reach state $k$ for the first time on step $n$ shows that the number of ways to do this becomes: ${k+2l-1 \choose l} – {k+2l-1 \choose l-1}$. Now, I want to loop through these paths. So, just as the first two posts above show how to go from combinations to lexicographic position and back, is there an efficient way to do this for these paths as well (that doesn't involve storing all paths in memory). For this question, let's consider we're given a path in binary. Something like: [0,1,0,0,0,1,1] (1's are tosses where he got tails) and want to get an integer which is the lexicographic position if you list all the valid paths (the total number of which is the generalized Catalan number).

Best Answer

Let $N(k,l)=\binom{k+2l-1}{l}-\binom{k+2l-1}{l-1}$ denote the number of paths. Note that $$ N(k,l) = N(k-1,l) + N(k+1,l-1) \qquad (k>0) $$ with the base cases $N(0,0)=1$, $N(0,l)=0$ for $l>0$. This comes from conditioning on whether the first step is a heads or tails. Furthermore, with the encoding Heads = 1, Tails = 0, all of the paths counted by $N(k+1,l-1)$ which begin with tails will occur lexicographically before those counted by $N(k-1,l)$. Therefore, to find the lexicographic position of a path, you can do as follows:

Maintain a variable accum, initially set to $0$.
Loop through the entries of the path from left to right.
- Whenever the current entry of the path is heads, add the value $N(k' + 1, l' - 1)$ to accum, where $l'$ is the number of tails including and after the current entry, and $k'+l'$ is the number of heads including and after the current entry.
After the loop, accum will equal the desired lexicographic index.

Conversely, to convert from a lexicographic index to a path, you can use the following algorithm:

Maintain a variable accum, initially set to the given lexicographic index.
Maintain two variable $k'$ and $l'$, initially set to the given $k$ and $l$. As the path is specified, these will represent the values of $k$ and $l$ for the unassigned part of the list.
Maintain a list of length $k+2l$, whose entries are initially unassigned (this will be the output path).
For each entry of the path, do the following:
- Check to see if $N(k' + 1, l' -1)$ is less than or equal to accum. If it is, then the current entry of the path should be set to heads. Furthermore, you should subtract $N(k' + 1,l' -1)$ from accum. Finally, subtract one from $k'$.
- Otherwise, the current entry of the path should be set to tails, and accum is unchanged. Then, add one to $k'$, and subtract one from $l'$.

If your only purpose is to efficiently loop through all these paths in lexicographic order, then converting back and forth between paths and lexicographic indices is not necessary. All you need is a method that, given a path, finds the lexicographically next path. By calling this method repeatedly, you will loop through all paths.

Here is the idea of such a method. Given a path as a sequence of H and T, the next path is given by…

Finding an instance of TH (meaning tails immediately followed by heads), and replacing it with HT.
- If there are multiple instances of TH, choose the rightmost one for which the switch TH -> HT still leaves a legal path. The switch is legal if and only if the accumulated wealth up to but not including TH is at most $k-2$.
Rearranging all of the flips to the right of that instance of TH so all tails occur before heads.

For example, suppose that $k=4$ and $l=3$, and the current path is

T H H T H H H T H H

There are three instances of TH. The accumulated wealth before each of them, going left to right, is $0$, $1$, and $3$. The ones where it would leave a legal path if you switch are $0$ and $1$; the last $3$ is too much, since $k-2=2$. Therefore, we switch the latest one. All of the heads after that are moved as far to the right as possible. Here is the next path, where the heads and tails which were switched are marked with ^.

T H H H T T H H H H
      ^ ^

Related Solutions

[Math] How to know if its permutation or combination in a specific exercise

"Permutation" vs. "Combination" is not about "choosing" or "not choosing". It's about whether the order matters or not.

And there is also a difference between how you count and what you are counting.

Here, you are correct: the order "matters" in that you are looking at ordered sequences of six tosses, and you want to count in how many of these ordered sequence there are exactly 2 heads and 4 tails. (If the order of the tosses did not matter, then the answer would be "One: 2 heads and 4 tails").

But, how do you count them? The simplest way of doing this is to imagine you have six blank spots,

    __  __  __  __  __  __

and will place either an H or a T in each spot. You have two Hs and four Ts to put down, and you want to find out in how many ways you can do it. Once you put the two Hs down, the Ts will be forced, so you just need to find in how many ways you can put the two Hs. It doesn't matter which H you put down first. The only thing that matters is where the two Hs end up, not which one went first. So to count this we will use combinations (because order does not matter). And so you need to find the number of ways in which you can choose two of the six spots, which is where the Hs will go. Combinations, because the order in which the Hs are placed does not matter: deciding the first and then the third tosses will be head amounts to the same thing as deciding that the third and the first tosses will be heads.

[Math] Fast way to get a position of combination (without repetitions)

Let us denote your tuple [a b c] as [1 1 1 0 0 0] and so on.

Define $\binom{n}{r}=0$ for $n<r$

For your tuple: $[a d f] = [1 0 0 1 0 1]$ $$P = 1\cdot \binom{0}{1}+0\cdot \binom{1}{1}+0\cdot \binom{2}{1}+1\cdot \binom{3}{2}+0\cdot\binom{4}{2}+1\cdot\binom{5}{3} + 1$$ $$P=0 + 0 +0 +3+0+10+0+1 = 14$$

General Algorithm:

Calculate the position value of each binary digit using $\binom{n}{r}$
Take $n$ as position of the digit from left, for leftmost digit $n=0$.
Write $r$ for each position as the number of 'ONES' counted from left, including the one at current position.

Example-1: [a b c] = [1 1 1 0 0 0]

Calculate the position of the tuple as sum: $$P = 1\cdot \binom{0}{1}+1\cdot \binom{1}{2}+1\cdot \binom{2}{3}+0\cdot \binom{3}{3}+0\cdot\binom{4}{3}+0\cdot\binom{5}{3} + 1$$ $$P=0 + 0 +0 +0+0+0+0+1 = 1$$

Example-2: [d e f] = [0 0 0 1 1 1] $$P = 0\cdot \binom{0}{0}+0\cdot \binom{1}{0}+0\cdot \binom{2}{0}+1\cdot \binom{3}{1}+1\cdot\binom{4}{2}+1\cdot\binom{5}{3} + 1$$ $$S=0+0+0+3+6+10+1=20$$

The lone ONE is added because you are not starting at zero.

Best Answer

Related Solutions

[Math] How to know if its permutation or combination in a specific exercise

[Math] Fast way to get a position of combination (without repetitions)

Related Question