[Math] (Fast way to) Get a combination given its position in (reverse-)lexicographic order

combinationscombinatoricscomputational mathematics

This question is the inverse of the Fast way to get a position of combination (without repetitions).

Given all $\left(\!\!\binom{n}{k}\!\!\right)$ combinations without repetitions (in either lexicographic or reverse-lexicographic order), what would be the most efficient way to translate/map/lookup a
position of the combination to the combination ($k$-tuple) itself?

I need this to be fast for combinations of $\left(\!\!\binom{70}{7}\!\!\right)$ order of magnitude – very large, but not exceeding 2 billion (fits into int32 maximum value).

Below is an example of $\left(\!\!\binom{6}{3}\!\!\right)$, where the aim is to quickly translate value 9 to tuple (a, d, f), while in the real problem $5 <= k <= 8$.

$$\begin{array}
{cccccc|c}
a&b&c&d&e&f&^{combination}/_{sort\_order}&
\\\hline
x&x&x& & & &1\\
x&x& &x& & &2\\
x&x& & &x& &3\\
x&x& & & &x&4\\
x& &x&x& & &5\\
x& &x& &x& &6\\
x& &x& & &x&7\\
x& & &x&x& &8\\
x& & &x& &x&9\\
x& & & &x&x&10\\
.&.&.&.&.&.&.\\
& & &x&x&x&20\\
\end{array}$$

I know that I could pre-calculate all the combinations and reverse the lookup dictionary. However, such dictionary would not be efficient in terms of memory usage. Therefore I am looking for either calculation-based approach, or a more efficient data structure to perform this mapping, and be able to process about 1 million such lookups during data processing job.

Note: the position can be 0-based.

Best Answer

To convert a lexicographic position to a combination:

Like the reverse problem discussed in Fast way to get a position of combination (without repetitions), a lexicographic position can be converted to a combination using combinatorial digit place values.

Let us define the tuple $(a,b,c)$ as a binary number $[0 0 0 1 1 1]$ and assign it the lexicographic order ZERO and henceforth treat each combination as a binary number.

Next we define the Least Significant Bit (LSB) and the Most Significant Bit (MSB). To be consistent with ordinary binary number representations, let us define the leftmost position as MSB and the rightmost position as LSB. Because we are picking three objects out of six, each corresponding binary tuple would have three ones and three zeros. Ones represent the objects selected and zeros represent the objects not selected.

Now define the place value to each digit. In ordinary binary numbers, digit place values start at LSB and go to MSB taking values $\{1,2,4,\cdots,2^n,\cdots \}$. Here $n$ is used as the position of the digit from LSB. Likewise, combinatorial place value is defined as $\binom{n}{r}$, where $n$ is the position of the digit from the LSB, following the binary convention. The parameter $r$ is the number of ones to the right of the digit, including its own position.

For example, $n=0$ for LSB and $n=5$ for MSB.

$r=3$ for leftmost one.

$r=1$ for rightmost one.

$r=2$ for the middle one.

Conversion From Lexicographic Position to Binary Tuple:

To convert a lexicographic position $L\_number$ to its corresponding combination, $L\_number$ is compared against the place value of the digit. The comparison starts at MSB. If the place value is less than $L\_number$, the corresponding binary number is set to one and $L\_number$ is decremented by the place value.

If place value $\ge L\_number$

  • Place ONE in that position
  • Update $L\_number = L\_number - place value$
  • Decrement $r$ in $\binom{n}{r}$
  • Compare $L\_number$ to place value at next position to right $(n = n - 1)$

If place value $< L\_number$

  • Move to next position $(n = n - 1)$

$\textit{Example:}$

Find the combination of three objects from six at the lexicographic place $9$.

$L_n = 9$,

Compare: $\{ \{ L_n = 9\} \geq \binom{5}{3} = 10 \} $, Result: $FALSE$, Combination: $[ 0 . . . . . ]$, $r = 3$, $L_n = 9$

Compare: $\{\{ L_n = 9\}\geq\binom{4}{3} = 4\}$, Result: $TRUE$, Combination: $[ 0 1 . . . . ]$, $r = 3-1 = 2$, $L_n = L_n - 4 = 9-4=5$

Compare: $\{ \{ L_n = 5\}\geq\binom{3}{2} = 3 \} $, Result: $TRUE$, Combination: $[ 0 1 1 . . . ]$, $r = 2-1 = 1$, $L_n = L_n - 3 = 5-3=2$

Compare: $\{\{ L_n = 2\}\geq\binom{2}{1} = 2 \} $, Result: $TRUE$, Combination: $[ 0 1 1 1 . . ]$, $r = 1-1 = 0$, $L_n = L_n - 2 = 2-2=0$

Compare: $\{ \{ L_n = 0\}\geq\binom{1}{0} = 1 \} $, Result: $FALSE$, Combination: $[ 0 1 1 1 0 . ]$, $r = 0$, $L_n = 0$,

Compare: $\{ \{ L_n = 0\}\geq\binom{0}{0} = 1 \} $, Result: $FALSE$, Combination: $[ 0 1 1 1 0 0 ]$, $r = 1-1 = 0$, $L_n = L_n - 2 = 2-2=0$

Since the final answer is $[0 1 1 1 0 0]$, the lexicographic order 9 corresponds to combination $(c,d,e)$.

The following function returns an array of binary values, given the size of the objects $n$, the number of objects to be picked $r$ and the lexicographic order $m$.

function [out]=encode2ts(n,r,m)
%Encodes the input integer 'm' to a constant weight code of n-digits with r-ones
%Most significant digit at highest index.

out = zeros(1,n);
while (n>0)
    if (n>r & r>=0)
        y = nchoosek(n-1,r);
    else
        y = 0;
    end

    if (m>=y)
        m = m - y;
        out(n) = 1;
        r = r - 1;
    else
        out(n) = 0;
    end
    n = n - 1;
 end