[Math] How to find the password space given several restrictions

combinationscombinatoricspermutations

I am trying to determine all valid passwords (the password space) that fulfill this list of requirements.

  1. Password is exactly 15 characters long.
  2. Must contain at least 2 lowercase letters (26 in total).
  3. Must contain at least 2 uppercase letters (26 in total).
  4. Must contain at least 2 numbers 0-9 (10 in total).
  5. Must contain at least 2 special characters (like !, @, # …) (35 in total).

I know that the unrestricted password space would be (26+26+10+35)^15 = 6.33*10^29 so the final answer must be less than that.

My first approach was to find all of the invalid passwords and subtract that from the unrestricted password space. However, even this step proved to be too complicated to for me. I broke down all invalid passwords into a matrix of how they fulfill the 4 "contain" requirements.

requirements fulfillment table

I thought that by finding the number of passwords that fit in each row excluding the first and last row, I'll have all invalid passwords and thus I can subtract that from the unrestricted password space. The first row is all valid passwords which what we are trying to find. The last row is impossible because it is impossible to fill 15 characters without using any lowercase, uppercase, numbers, and/or special characters at least twice. Order does matter, so permutations are used somewhere but I'm not sure how.

Any ideas on how to approach this is a simpler way? Thanks.

Best Answer

The most-bullet-headed formula uses multinomials:

$$\sum_{i+j+k+l=7}\binom{15}{i+2,j+2,k+2,l+2}10^{i+2}26^{j+k+4}35^{l+2}$$ Where $i+2,j+2,k+2,l+2$ counts the the number of digits, lower case, upper case, and special characters, respectively.

The multinomial counts the number of ways to get an ordered partition of the $15$ positions into sets of sizes $i+2,j+2,k+2,l+2,$ and the exponents are the number of words using any such partition.

There are $\binom{10}{3}=120$ tuples $(i,j,k,l).$

We can simplify this formula:

$$\sum_{i+l\leq7}\binom{15}{i+2,l+2,11-i-l}10^{i+2}35^{l+2}26^{11-i-l}\sum_{j=0}^{7-i-l}\binom{11-i-l}{j+2}\\= \sum_{i+l\leq7}\binom{15}{i+2,l+2,11-i-l}10^{i+2}35^{l+2}26^{11-i-l}\left(2^{11-i-l}-2(12-i-l)\right)$$

Because the inner sum is the terms of $(1+1)^{11-i-l}$ minus the first two and last two. This means we have $36$ terms because we have $36$ pairs $(i,l)$ with $i+l\leq7.$

This might be the best way when computing for passwords of length $15.$ But for passwords of length $N$ in general, the number of terms is $\binom{N-7}{2}=O(N^2).$

The other approaches will have a fixed number of terms for all $N.$


Here is a python program that does this simpler computation:

import math

def multinomial(*args):
   n = sum(args)
   if n<0:
       return 0
   value = math.factorial(n)
   for i in args:
       if i<0:
           return 0
       value = value//math.factorial(i)
   return value

value = 0
N=15
for i in range(0,N-7):
    for l in range(0,N-7-i):
       r = N-4-l-i # r = j+k+4 in the original answer
       m = multinomial(i+2,l+2,r)
       e = (10**(i+2))*(35**(l+2))*(26**r)
       value += m*e*((2**r) - 2*(r+1))

print(value)

This returns:

244,746,643,734,876,703,775,955,600,000

or about $2.45\cdot 10^{29}.$


Inclusion/Exclusion.

Let $A$ be all $15$-letter words. Let $B,C,D,E$ all words containing $0$ or $1$ letters, respectively, from lowercase letters, uppercase letters, digits, and special characters.

You want to compute:

$$|A\setminus(B\cup C\cup D\cup E)|$$

This is, by inclusion-exclusion:

$$\begin{align} |A|-&\left(|B|+|C|+|D|+|E|\right)\\ +&\left(|B\cap C|+|B\cap D|+\cdots +|D\cap E|\right)\\ -&\left(|B\cap C\cap\cap D|+\cdots+|C\cap D\cap E|\right)\\ +&|B\cap C\cap D\cap E|\end{align}$$

The last term is zero, since it isn’t possible foe there to be fewer than 2 of each character class and still get $15$ letter words.

If there are $b,c,d,e$ characters of each type, then: $$|A|=(b+c+d+e)^{15}\\|B|=(c+d+e)^{15}+15b(c+d+e)^{14}\\ |C|=(b+d+e)^{15}+15c(b+d+e)^{14}\\ \dots$$ Where we get similar counts for $|D|$ and $|E|.$

The hard part are the other terms. I don’t see a good way to do them other than checking all four cases for the two-term cases and all eight cases for the three-term cases.

So:

$$|B\cap C|=(d+e)^{15}+15(b+c)(d+e)^{14}+15\cdot14bc(d+e)^{13}$$ and similarly for the other pairs.

$$|B\cap C\cap D|=e^{15}+15(b+c+d)e^{14}+15\cdot 14(bc+bd+cd)e^{13}\\+15\cdot 14\cdot 13bcd e^{12}$$ and similarly for the other triples.

Not pleasant, but it works.


Generating Functions

The generating function technique was a good approach. Sadly, the person who posted it deleted it rather than correct the a wrong first pass.

You want the coefficient of $\frac{x^{15}}{15!}$ in $$(e^{26x}-26x-1)^2(e^{10x}-10x-1)(e^{35x}-35x-1)$$

Expanding this gives in the general case $81$ terms, but since two of the values, $26,$ are the same, you get fewer distinct terms. Each term of the form $ax^ne^{mx}$ and contributing a term of $$a\frac{15!}{(15-n)!}m^{15-n}.$$

Some inspection shows this is the same as the inclusion-exclusion value.

It’s still a fair amount of work, but work a computer can do.

For example, the terms with $m=36$ have expansions: $2(1+35x)(1+26x)=2+122x+1820x^2,$ yielding terms $$2\cdot 36^{15}+122\cdot 15\cdot 36^{14}+1820\cdot15\cdot14\cdot 36^{13}$$

More generally, if we had $k$ disjoint character types and $b_i$ characters in type $i,$ and want a password of $N$ letters, then, for each $S\subseteq \{1,2,\dots,k\},$ define $b_{S}=\sum_{i\in S} b_i.$ Then the terms in the expansion are:

$$(-1)^{k-|S|}e^{b_Sx}\prod_{i\notin S}(1+b_ix)$$ has $k-|S|+1$ terms.

This means that in general, the product will have $$\sum_{i=0}^k (k-i +1)\binom{k}i=(k+1)2^k-k2^{k-1}=(k+2)2^{k-1}$$ terms. That’s 48 terms in the general case, but fewer when values $b_S$ are not distinct, as in the original question. Wolfram Alpha gives $36$ terms in the original.

Define $p_S=\prod_{i\in S} b_i.$

You can write the exact formula as:

$$\sum_{S}(-1)^{k-|S|}\sum_{T\subseteq S^c}p_T \frac{N!}{(N-|T|)!}b_{S}^{N-|T|}$$

You can combine terms where $|T|=i$ and it becomes:

$$T_N=\sum_{S}(-1)^{k-|S|}b_S^{N}\sum_{i=0}^{k-|S|}\frac{N!}{(N-i)!b_S^{i}}\sum_{|T|=i,T\cap S=\emptyset}p_T.$$

(Here, you need $b_{\emptyset}^0=0^0=1.$)

Of course: $$\sum_{|T|=i, T\cap S=\emptyset} p_T$$ is just the elementary symmetric polynomial of degree $i$ on the $k-|S|$ tuple $(b_r)_{r\notin S}.$

If $s_i(x_1,\dots)$ is the elementary symmetric polynomial of degree $i$ then this becomes: $$T_N=\sum_{S}(-1)^{k-|S|}b_S^{N}\sum_{i=0}^{k-|S|}\frac{N!}{(N-i)!b_S^{i}}s_i(b_r)_{r\notin S}.\tag 1$$

You see the symmetric polynomials in the inclusion-exclusion section clearly, in the computation for $|B\cap C\cap D|.$


It's not really surprising that symmetric polynomials come into it, since $T_N$ is definitely symmetric in $b_1,\dots,b_k$ since any re-ordering of the $b_i$ should give the same count. But it would be interesting to see the total expression expressed for each $N$ as a polynomial on the $s_i=s_{i}(b_i)_{i=1}^k$ for $i=0,\dots,k.$

We see that $T_N(b_1,\dots,b_k)$ is a homogeneous polynomial of degree $N$ for each $N.$ So it can be written as:

$$\sum_{i_1+2i_2+\dots+ki_k=N} c_{i_1,i_2,\dots,i_k}s_1^{i_1}s_2^{i_2}\cdots s_k^{i_k}$$

For some $c_{*}.$


The coefficient of $b_S^N$ in $(1)$ is a polynomial of degree $k-|S|$ in $N.$ So we get that $T_N$ satisfies a “simple” linear recurrence. But the recurrence is of degree $(k+2)2^{k-1},$ so it won’t be useful here for $k=4, N=15.$

But if you want to compute a lot of consecutive $T_N,$ the recurrence might be the easiest formula.

The degree of the recurrence will be smaller with repeated $b_i$ or repeated $b_S$ values more generally. In the case of the original problem, there are $1,3,4,3,1$ distinct values $B_S$ for $|S|=0,1,2,3,4$ respectively, and the recurrence is of degree $1\cdot 1+2\cdot 3+3\cdot 4+4\cdot 3+5\cdot 1=33,$ down from $48$ in the general case $k=4.$ Also, $b_{\emptyset}=0,$ so it mostly isn't needed, except when $N=0.$

And of course, $T_0=T_1=\cdots = T_7=0.$ So you need to compute $T_{8}$ through $T_{32}$ and then you get a recurrence of degree $N$ for computing higher $N.$

For each $b=b_S$ for some $S,$ let $f(b)$ be the size of the smallest $S$ which has $b_S.$ Let $B$ be the set of distinct non-zero values $b_S.$

Then we expand $$p(x)=\prod_{b\in B}(x-b)^{k+1-f(b)}=\sum_{k=0}^{D} a_ix^i.$$ We get $a_{D}=1,$ and the recurrence: $$T_{n+D}=\sum_{i=0}^{D-1} a_{i}T_{n+i},$$ when $N>0.$ In our case: $$T_{n+32}=-\sum_{i=0}^{31} a_iT_{n+i}$$ when $n>0.$

Still not practical for finding $N=15.$

But we can get the symmetric polynomial formula for $T_n$ this way. For the general case. This is because we can express the coefficients of $\prod_{S\neq\emptyset} (x-b_S)^{k+1-|S|}$ in terms of the symmetric polynomials in the general case. That is messy, but possible.

For $k=4,$

$$\begin{align}T_8/s_4^2&=2520\\ T_9/s_4^2 &= 7560s_1\\ T_{10}/s_4^2 &= 18900s_1^2-12600s_2\\ T_{11}/s_4^2&=\binom{11}{5,2,2,2}\sum b_i^3+\binom{11}{4,3,2,2}\sum_{i\neq j}b_i^2b_j+\binom{11}{3,3,3,2}s_3\end{align}$$ Th terms in $T_{11}$ are already getting messy.

I definitely don't see a good general result for computing the $c_*.$