The probability is
$$\frac{47058584898515020667750825872}{174165229296062536531664039375} \approx 27.0195\%$$
which matches the several simulations in your linked forum posts and in Paxinum's answer here.
First, I'll sketch a proof that the optimal strategy is to each time guess any rank (1 to 13) that has been seen the most number of times so far. If this is obvious to you, you can skip to the next section.
Optimal strategy
Let's call each guess you have to make a "round". So the game has at most 52 rounds.
Claim 1. The optimal strategy is to guess, in each round, a rank which maximises the probability of winning that round.
Proof. Essentially, a guess in any round does not constrain future guesses. Whatever you guess, it determines only whether the game ends immediately, and not the future state which is determined by what cards are in the deck and not by what you guess. Formally, suppose you have any strategy $S$ which in some round guesses a rank that is not optimal for that round. Then you can replace it with a strategy that in that round guesses the best rank, and in future rounds does exactly the same as $S$. (For this proof to work, we can modify the game so that we continue to make guesses for all 52 rounds, and only at the end decide that we have lost the game if we ever guessed a card correctly. This obviously doesn't affect the game.) This strategy has a strictly higher probability of winning than $S$; thus $S$ is not optimal.
Claim 2. In each round, the best guess is a rank which has been seen the most number of times.
Proof. Suppose rank $r$ has been seen $s_r$ times so far, and that there are still $D$ cards in the deck. So there are $(4-s_r)$ cards of rank $r$ still in the deck. If you guess $r$, the probability of losing that round, i.e., the probability that the next card that turns up has rank $r$, is $\frac{4-s_r}{D}$, which is minimised when $4-s_r$ is minimum, i.e., $s_r$ is maximum.
Putting them together,
Corollary. The optimal strategy is to guess, in each round, a rank which has been seen the most number of times.
Note that if you have seen multiple ranks the same number of times ($s_{r_1} = s_{r_2}$), it doesn't matter which one you guess. So let's assume wlog that you guess (say) the smallest such rank.
Computation
Now that we have fixed the strategy, how do we calculate the probability that it wins the game? In principle, we could try all permutations of the deck and see for how many it works. In practice, that leaves us with $52! > 10^{67}$ permutations, out of which you can at most sample a few and check (as the other answer and simulations have done), but you cannot possibly enumerate them all. Even if you notice that you don't care about the suit, you still have $52!/4!^{13} > 10^{49}$ orderings, and even if you observe that the ranks are symmetric you still have $52!/(4!^{13} \times 13!) > 10^{40}$ orderings to check.
For something better, notice that the state of the game at any point is captured by the number of times you have seen each rank so far (the $s_r$s above); the order in which they originally appeared is irrelevant now and in the future. Thus we can describe the state of the game by specifying $s_r$ for each $1 \le r \le 13$, and as $0 \le s_r \le 4$, this gives $5^{13} = 1220703125$, much more manageable. For something even better, note that as you don't care about the actual ranks but only their number, you can simply keep how many of the cards have been seen 0 times, how many have been seen 1 time, etc. That is, let $n_k$ denote the number of $r$ for which $s_r = k$. Then the state is $(n_0, n_1, n_2, n_3, n_4)$, and $\sum_{k=0}^{4} n_k = 13$. In fact, for any state in which $n_4 > 0$ you can determine the probability of winning from that state immediately (it is $1$, as you will keep guessing any $r$ for which $s_r = 4$, and never be wrong), so you only need to care about states in which $n_4 = 0$. The number of such states is the number of nonnegative integer solutions to $n_0 + n_1 + n_2 + n_3 = 13$ which is $\binom{16}{3} = 560$, a number small enough that with enough patience one could even handle it by hand.
Let $p_n$ be the probability of winning from state $n$, where the tuple $n = (n_0, n_1, n_2, n_3, n_4)$ is as above. The number of cards seen so far is $\sum_{k=0}^4 {kn_k}$. The number of cards still in the deck, $D$, is 52 minus this number. If $n_4 > 0$, then $p_n = 1$. Else, let $g$ be the largest $k$ for which $n_k > 0$. You will guess a rank counted in $n_g$. The number of ranks with each count, other than the one you guess, is
$$n'_k = \begin{cases}n_k & \text{if $k\neq g$,} \cr n_k - 1 & \text{if $k=g$.}\end{cases}$$
For a particular $k$, for each rank counted in $n_k$, the probability that the next card that turns up is of that rank is $\frac{4-k}{D}$, so the probability that some card of any such rank (other than the one we guess) turns up is $\frac{n'_k(4-k)}{D}$. Once such a card turns up, the next state has one less rank in $n_k$ and one more in $n_{k+1}$. That is, calling the new state $m = m(k)$,
$$m_j = \begin{cases}n_j - 1 & \text{if $j = k$,} \cr n_j + 1 & \text{if $j=k+1$} \cr n_j &\text{otherwise.} \end{cases}$$
So the probability of winning from state $n$ (if $n_4 = 0$) is
$$p_n = \sum_{k=0}^{3} \frac{n'_k(4-k)}{D}p_{m(k)}.$$
(The code, if anyone's interested, is at the end of the second revision of this answer).)
Edit: By Googling for the answer obtained above, I found this paper:
- Kevin Liwack, Oleg Pikhurko, Suporn Pongnumkul (2008), How to Play Dundee. arXiv 0803.0156
They consider this problem and obtain exactly the same answer. They too say "it seems that there is no general closed formula" and find the answer by computer; they seem to be using the algorithm I mentioned in passing above, which for this standard deck considers $5^{13}$ states. They also study the variant where the player has to specify all the sequence of guesses in advance. For some reason they have a relatively long proof that the greedy strategy above is optimal, but I can't find any bug in my proof above.
Best Answer
I believe this is a Bernoulli distribution, and if you were sampling with replacement, you could take 52 Bernoulli trials which is essential producing the binomial distribution. But your sampling without replacement, so this is a hypergeometric distribution: $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ where $N$ is the population of finite items, $M$ items of type 1 and $N-M$ is of type two. This can be generalized to $K$ different types. $n$ is the number of this drawn without replacement and $x$ is the number of things of interest. There are a couple of ways to work your problem, but I'll get you a different problem for inspiration.
The population of jelly beans is $50$. By sampling with replacement, notice each time you're trying to pick a purple jelly bean, the probability is the same $\frac{20}{50}$. So given the binomial distribution for 10 repeated Bernoulli trials, we have $$b(x;n,p)={n\choose x}p^xq^{n-x}$$ So, $P[X=5]=(5;10,\frac{2}{5})={10 \choose 5}\cdot (\frac{2}{5})^5\cdot (1-\frac{2}{5})^5=0.2007$. In summary, choosing exactly five, with replacement, when you're only going to choose 10 times, where each selection has $\frac{2}{5}$ chance of success gives you a probability of 0.2007.
Now the exact same set up but without replacement, this is a hypergeometric distribution. So, $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ Plugging in what we know, we have: $$\frac{{20\choose 5}{30\choose 5}}{50\choose 10} = 0.2151$$ Now let's look at whats going on asymptotically; let let the population go to $1$ billion, keeping the same proportions, but still only choosing 10 times, where you want to find the probability that $5$ are purple jelly beans. You can work it out for yourself, but the sampling with replacement part stays the same... Interesting... Now let's calculate the portion where we were sampling with replacement: $$\frac{{400,000,000\choose 5}{600,000,000\choose 5}}{1,000,000,000\choose 10} = 0.200658125$$ What is interesting here is that as you let the population size grow, sampling without replacement essential converges to sampling with replacement. The percent error before with the population being $50$ was about $6.93%$, but now with the population being $1$ billion, the percent error is about $0.021%$. As $N \rightarrow \infty$, the percent error will converge to $0$.
So using this example, you can see if you run the experiment with different parameters, you're wanting to maximize something. One way would be to set up the experiment, running iteratively, and determine what the max is. I do wonder whether you could simply use calculus on the pdf or cdf curves given the set up.
This is also a simple counting problem though.. Assuming that the deck of cards were bridged shuffled at least seven times to produce a random ordering... You are selecting, at random, from a finite population, without replacement. What counting techniques should apply here? For each time you draw, you know what card was shown, so how does that change the probabilities. How do you maximize this without looking at the distribution itself?
Hopefully this helps.