Guessing cards with no replacement and perfect memory

probability

Suppose we were playing a game where a player has to guess the top card (number only, suit is not considered) of a standard, 52 card deck of playing cards. As the player guesses, he will remove the top card from the deck and flip it over to check the accuracy of his guess; he will not put cards back into the deck as the game progresses. In other words, the guesser can see all the cards that have been turned over.

The player's goal is to maximize his number of correct guesses.
Intuitively, a player should guess cards that have been seen 0 times until none remain, then cards that have been seen 1 time until none remain, then cards that have been seen 2 times until none remain, and lastly cards that have been seen 3 times until none remain.

What would the expected number of correct guesses be across the deck of 52 cards, on average?

Somewhat similar problems:
Failing to guess each card in a deck

Odds of guessing suit from a deck of cards, with perfect memory

Best Answer

I believe this is a Bernoulli distribution, and if you were sampling with replacement, you could take 52 Bernoulli trials which is essential producing the binomial distribution. But your sampling without replacement, so this is a hypergeometric distribution: $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ where $N$ is the population of finite items, $M$ items of type 1 and $N-M$ is of type two. This can be generalized to $K$ different types. $n$ is the number of this drawn without replacement and $x$ is the number of things of interest. There are a couple of ways to work your problem, but I'll get you a different problem for inspiration.

Assume a jar contains $30$ green jelly beans and $20$ purple jelly beans. Suppose $10$ jelly beans are selected at random from the jar. Find the probability of obtaining exactly $5$ purple jelly beans if there were selected with replacement. Now, find the probability of obtaining exactly $5$ purple jelly beans if they were selected without replacement. Finally, consider these scenarios asymptotically.

The population of jelly beans is $50$. By sampling with replacement, notice each time you're trying to pick a purple jelly bean, the probability is the same $\frac{20}{50}$. So given the binomial distribution for 10 repeated Bernoulli trials, we have $$b(x;n,p)={n\choose x}p^xq^{n-x}$$ So, $P[X=5]=(5;10,\frac{2}{5})={10 \choose 5}\cdot (\frac{2}{5})^5\cdot (1-\frac{2}{5})^5=0.2007$. In summary, choosing exactly five, with replacement, when you're only going to choose 10 times, where each selection has $\frac{2}{5}$ chance of success gives you a probability of 0.2007.

Now the exact same set up but without replacement, this is a hypergeometric distribution. So, $$h(x;n,M,N)=\frac{{M\choose x}{N-M\choose n-x}}{{N\choose n}} $$ Plugging in what we know, we have: $$\frac{{20\choose 5}{30\choose 5}}{50\choose 10} = 0.2151$$ Now let's look at whats going on asymptotically; let let the population go to $1$ billion, keeping the same proportions, but still only choosing 10 times, where you want to find the probability that $5$ are purple jelly beans. You can work it out for yourself, but the sampling with replacement part stays the same... Interesting... Now let's calculate the portion where we were sampling with replacement: $$\frac{{400,000,000\choose 5}{600,000,000\choose 5}}{1,000,000,000\choose 10} = 0.200658125$$ What is interesting here is that as you let the population size grow, sampling without replacement essential converges to sampling with replacement. The percent error before with the population being $50$ was about $6.93%$, but now with the population being $1$ billion, the percent error is about $0.021%$. As $N \rightarrow \infty$, the percent error will converge to $0$.

So using this example, you can see if you run the experiment with different parameters, you're wanting to maximize something. One way would be to set up the experiment, running iteratively, and determine what the max is. I do wonder whether you could simply use calculus on the pdf or cdf curves given the set up.

This is also a simple counting problem though.. Assuming that the deck of cards were bridged shuffled at least seven times to produce a random ordering... You are selecting, at random, from a finite population, without replacement. What counting techniques should apply here? For each time you draw, you know what card was shown, so how does that change the probabilities. How do you maximize this without looking at the distribution itself?

Hopefully this helps.

Related Question