Consider the situation from this video. Cards from a normal playing deck are dealt out one at a time while the player guesses what the number of the card is not (suit is ignored). To be clear, the player wants to go through the entire deck, never having their guess match the card. What are the odds that the player will achieve this, assuming they possess a perfect memory and use a perfect strategy? There is some discussion on the show's forums.
Probability – Failing to Guess Each Card in a Deck
card-gamesprobability
Related Solutions
This is for the one player game and a initial hand of five cards. I will consider that you have a hand with $k$ cards if after throwing away all groups of four cards of the same value, you have $k$ cards in your hand. For instance, if your initial hand is $\{2,3,3,3,3\}$ I will consider it a one card hand, not a five card hand.
Let $P(k)$ be the probability that at some moment in the game you have a $k$ card hand. The following values are easy: $P(0)=1$ (at the end of the game you have no cards in your hand) and $P(40)=P(41)=\dots=P(52)=0$ (if you have $40$ cards, there must be a group of four cards of the same value). A little thought gives $$ P(39)=\frac{4^{13}}{\dbinom{52}{13}}=0.000105681 $$
For the rest of the values I have run a simulation in Mathematica of $10^7$ games. These are the results:
k H(k)
0 483
1 25839
2 596131
3 10000000
4 230004
5 10000000
6 10000000
7 10000000
8 10000000
9 10000000
10 10000000
11 10000000
12 10000000
13 10000000
14 10000000
14 9999996
16 9999945
17 9999720
18 9998600
19 9994328
20 9980514
21 9942513
22 9854106
23 9675123
24 9348731
25 8822787
26 8068444
27 7080395
28 5919525
29 4675434
30 3458156
31 2381202
32 1514558
33 876208
34 458693
35 213203
36 84852
37 28160
38 7216
39 1067
For each $k$, $H(k)$ is the number of games in which a hand of $k$ cards has been held before reaching the end of the deck. Observe that the value of $H(39)$ is in accordance with the exact value of $P(39)$. A graph of the results:
It is surprising (at least to me) that for certain values of $k$, like $k=5$, a hand of $k$ cards was held in all $10^7$ games, even if a deck like
1,1,1,1,2,2,2,2,3,3,3,3,...
will give only hands of $1$, $2$, $3$ and $4$ cards.
I believe this is a Bernoulli distribution, and if you were sampling with replacement, you could take 52 Bernoulli trials which is essential producing the binomial distribution. But your sampling without replacement, so this is a hypergeometric distribution: $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ where $N$ is the population of finite items, $M$ items of type 1 and $N-M$ is of type two. This can be generalized to $K$ different types. $n$ is the number of this drawn without replacement and $x$ is the number of things of interest. There are a couple of ways to work your problem, but I'll get you a different problem for inspiration.
Assume a jar contains $30$ green jelly beans and $20$ purple jelly beans. Suppose $10$ jelly beans are selected at random from the jar. Find the probability of obtaining exactly $5$ purple jelly beans if there were selected with replacement. Now, find the probability of obtaining exactly $5$ purple jelly beans if they were selected without replacement. Finally, consider these scenarios asymptotically.
The population of jelly beans is $50$. By sampling with replacement, notice each time you're trying to pick a purple jelly bean, the probability is the same $\frac{20}{50}$. So given the binomial distribution for 10 repeated Bernoulli trials, we have $$b(x;n,p)={n\choose x}p^xq^{n-x}$$ So, $P[X=5]=(5;10,\frac{2}{5})={10 \choose 5}\cdot (\frac{2}{5})^5\cdot (1-\frac{2}{5})^5=0.2007$. In summary, choosing exactly five, with replacement, when you're only going to choose 10 times, where each selection has $\frac{2}{5}$ chance of success gives you a probability of 0.2007.
Now the exact same set up but without replacement, this is a hypergeometric distribution. So, $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ Plugging in what we know, we have: $$\frac{{20\choose 5}{30\choose 5}}{50\choose 10} = 0.2151$$ Now let's look at whats going on asymptotically; let let the population go to $1$ billion, keeping the same proportions, but still only choosing 10 times, where you want to find the probability that $5$ are purple jelly beans. You can work it out for yourself, but the sampling with replacement part stays the same... Interesting... Now let's calculate the portion where we were sampling with replacement: $$\frac{{400,000,000\choose 5}{600,000,000\choose 5}}{1,000,000,000\choose 10} = 0.200658125$$ What is interesting here is that as you let the population size grow, sampling without replacement essential converges to sampling with replacement. The percent error before with the population being $50$ was about $6.93%$, but now with the population being $1$ billion, the percent error is about $0.021%$. As $N \rightarrow \infty$, the percent error will converge to $0$.
So using this example, you can see if you run the experiment with different parameters, you're wanting to maximize something. One way would be to set up the experiment, running iteratively, and determine what the max is. I do wonder whether you could simply use calculus on the pdf or cdf curves given the set up.
This is also a simple counting problem though.. Assuming that the deck of cards were bridged shuffled at least seven times to produce a random ordering... You are selecting, at random, from a finite population, without replacement. What counting techniques should apply here? For each time you draw, you know what card was shown, so how does that change the probabilities. How do you maximize this without looking at the distribution itself?
Hopefully this helps.
Best Answer
The probability is $$\frac{47058584898515020667750825872}{174165229296062536531664039375} \approx 27.0195\%$$ which matches the several simulations in your linked forum posts and in Paxinum's answer here.
First, I'll sketch a proof that the optimal strategy is to each time guess any rank (1 to 13) that has been seen the most number of times so far. If this is obvious to you, you can skip to the next section.
Optimal strategy
Let's call each guess you have to make a "round". So the game has at most 52 rounds.
Claim 1. The optimal strategy is to guess, in each round, a rank which maximises the probability of winning that round.
Proof. Essentially, a guess in any round does not constrain future guesses. Whatever you guess, it determines only whether the game ends immediately, and not the future state which is determined by what cards are in the deck and not by what you guess. Formally, suppose you have any strategy $S$ which in some round guesses a rank that is not optimal for that round. Then you can replace it with a strategy that in that round guesses the best rank, and in future rounds does exactly the same as $S$. (For this proof to work, we can modify the game so that we continue to make guesses for all 52 rounds, and only at the end decide that we have lost the game if we ever guessed a card correctly. This obviously doesn't affect the game.) This strategy has a strictly higher probability of winning than $S$; thus $S$ is not optimal.
Claim 2. In each round, the best guess is a rank which has been seen the most number of times.
Proof. Suppose rank $r$ has been seen $s_r$ times so far, and that there are still $D$ cards in the deck. So there are $(4-s_r)$ cards of rank $r$ still in the deck. If you guess $r$, the probability of losing that round, i.e., the probability that the next card that turns up has rank $r$, is $\frac{4-s_r}{D}$, which is minimised when $4-s_r$ is minimum, i.e., $s_r$ is maximum.
Putting them together,
Corollary. The optimal strategy is to guess, in each round, a rank which has been seen the most number of times.
Note that if you have seen multiple ranks the same number of times ($s_{r_1} = s_{r_2}$), it doesn't matter which one you guess. So let's assume wlog that you guess (say) the smallest such rank.
Computation
Now that we have fixed the strategy, how do we calculate the probability that it wins the game? In principle, we could try all permutations of the deck and see for how many it works. In practice, that leaves us with $52! > 10^{67}$ permutations, out of which you can at most sample a few and check (as the other answer and simulations have done), but you cannot possibly enumerate them all. Even if you notice that you don't care about the suit, you still have $52!/4!^{13} > 10^{49}$ orderings, and even if you observe that the ranks are symmetric you still have $52!/(4!^{13} \times 13!) > 10^{40}$ orderings to check.
For something better, notice that the state of the game at any point is captured by the number of times you have seen each rank so far (the $s_r$s above); the order in which they originally appeared is irrelevant now and in the future. Thus we can describe the state of the game by specifying $s_r$ for each $1 \le r \le 13$, and as $0 \le s_r \le 4$, this gives $5^{13} = 1220703125$, much more manageable. For something even better, note that as you don't care about the actual ranks but only their number, you can simply keep how many of the cards have been seen 0 times, how many have been seen 1 time, etc. That is, let $n_k$ denote the number of $r$ for which $s_r = k$. Then the state is $(n_0, n_1, n_2, n_3, n_4)$, and $\sum_{k=0}^{4} n_k = 13$. In fact, for any state in which $n_4 > 0$ you can determine the probability of winning from that state immediately (it is $1$, as you will keep guessing any $r$ for which $s_r = 4$, and never be wrong), so you only need to care about states in which $n_4 = 0$. The number of such states is the number of nonnegative integer solutions to $n_0 + n_1 + n_2 + n_3 = 13$ which is $\binom{16}{3} = 560$, a number small enough that with enough patience one could even handle it by hand.
Let $p_n$ be the probability of winning from state $n$, where the tuple $n = (n_0, n_1, n_2, n_3, n_4)$ is as above. The number of cards seen so far is $\sum_{k=0}^4 {kn_k}$. The number of cards still in the deck, $D$, is 52 minus this number. If $n_4 > 0$, then $p_n = 1$. Else, let $g$ be the largest $k$ for which $n_k > 0$. You will guess a rank counted in $n_g$. The number of ranks with each count, other than the one you guess, is $$n'_k = \begin{cases}n_k & \text{if $k\neq g$,} \cr n_k - 1 & \text{if $k=g$.}\end{cases}$$ For a particular $k$, for each rank counted in $n_k$, the probability that the next card that turns up is of that rank is $\frac{4-k}{D}$, so the probability that some card of any such rank (other than the one we guess) turns up is $\frac{n'_k(4-k)}{D}$. Once such a card turns up, the next state has one less rank in $n_k$ and one more in $n_{k+1}$. That is, calling the new state $m = m(k)$, $$m_j = \begin{cases}n_j - 1 & \text{if $j = k$,} \cr n_j + 1 & \text{if $j=k+1$} \cr n_j &\text{otherwise.} \end{cases}$$
So the probability of winning from state $n$ (if $n_4 = 0$) is $$p_n = \sum_{k=0}^{3} \frac{n'_k(4-k)}{D}p_{m(k)}.$$
(The code, if anyone's interested, is at the end of the second revision of this answer).)
Edit: By Googling for the answer obtained above, I found this paper:
They consider this problem and obtain exactly the same answer. They too say "it seems that there is no general closed formula" and find the answer by computer; they seem to be using the algorithm I mentioned in passing above, which for this standard deck considers $5^{13}$ states. They also study the variant where the player has to specify all the sequence of guesses in advance. For some reason they have a relatively long proof that the greedy strategy above is optimal, but I can't find any bug in my proof above.