[Math] Permutations of fixed length words of an arbitrary alphabet with fixed number of different letters

combinatoricspermutations

More precise generic formulation of the problem:

Given an alphabet of $A$ different letters, how many $K$ length words can we form that have exactly $D$ different letters?

Example:

Given $\{a,b,c,d,e,f\}$ as the alphabet (for $A = 6$), how many $5$ letter words (for $K = 5$) can we form that have exactly $3$ different letters (for $D = 3$)?

So a couple of such words would be:

$$ abddd, adbdd, abeee, abfff, abbcc, abcbc, afafb, fabbf $$

…etc

Any suggestion to look for the shortest general solution is appreciated.

Best Answer

The Stirling numbers of the second kind $\def\st{\genfrac\{\}0{}}\st nk$ count the number of ways to partition a set of $n$ distinct values into $k$ nonempty subsets. Taking those $n$ objects to be the $n$ positions in a word, and the subsets those that will receive the same letter, we see that with an alphabet of $x$ letters we can make $$ \st nk\,x^{\underline k}\stackrel{\rm def}=\st nk\,x(x-1)\ldots(x-k+1)=\st nk\binom xkk! $$ different words of length $n$ that use exactly $k$ different letters: once our partition into subsets chosen, we have $x$ choices for the letter to attached to the first subset, $x-1$ choices for the second subset, and so forth.

Note that $\sum_{k=0}^n\st nk\,x^{\underline k}=x^n$ is a well known relation, and in our application expresses the fact that each of the $x^n$ words of length$~n$ contributes to the term for exactly one vale of$~k$, its number of different letters.

In terms of the names used in the question, take $n=K$, $k=D$, and $x=A$.

There are many ways to compute the Stirling numbers of the second kind efficiently, but no simple multiplicative formula as for the binomial coefficients. If you need to go up to $n=100$ then you certainly need big integers to store them; to compute them I would personally just use the additive recursion $$ \st{n+1}k=\st n{k-1}+k\st nk $$ which computes them using $O(n^2)$ small multiplications and additions.