Probability Theory – Analyzing a Coding Theory and Probability Puzzle

coding-theoryprobabilitypuzzle

I thought of the following problem and I am stuck in solving it.

Suppose there is a deck of 4 cards with 2 red and 2 blue. I pick 2 cards at random and choose 1 and show the other to my friend. With what probability can my friend find the color of my card if we have agreed on a good strategy in advance?

If my friend always guesses the color red, we only fail in this game if I get 2 blue cards. That happens with probability 1/4 and it is the best I can achieve in this case.

However, if the cards are numbered we can do even better, i.e. there is red card 1 and red card 2, blue card 1 and blue card 2. In this case, we can agree that my friend chooses the color of the card I gave him if the number is 1 and flips the color if it is 2. You can check that the only way we can fail is if I get both the red and the blue card of 1. (If I get both cards of the same color I show him the number 1 card. If I get two cards of different colors I show him the number 2 card and we win.) Since the probability of drawing two cards with the number 1 is 1/6, having numbers on the cards clearly helps.

My question is what happens when there are $N$ cards of $N$ colors ($N^2$ in total) and I draw $N$ cards, choose 1 and reveal $N-1$ cards to my friend (in a random order, i.e. the order cannot encode information). What is the strategy that maximizes our probability of winning? How does this differ if cards are numbered or if they are not?

I am interested both in optimal strategies for small $N$, or with asymptotic bounds for large $N$ in both the numbered and unnumbered cases. Could it be that the probability in the numbered case goes to 1 and the unnumbered case is small? This would be very unintuitive!

Best Answer

This answer does not give an answer to the general question for large $N$, but at least shows a method by which you can compute the optimal strategy for any specific $N$. This is achieved by formulating the problem as finding a specific matching on a bipartite graph.

Let each node on one side, lets say $A$, represent the combination of cards you can draw and the nodes on the other side, $B$, represent the possible choices your friend can see. An edge in this graph will connect $a \in A$ with $b \in B$ iff $a$ can become $b$ by removing a card. A good strategy for this game would consist of a matching on the graph. Why? Because when you get a combination of cards, you have to have made a choice of which card to remove or which node to pair it up with. The matching also tells your friend what to guess: he should just guess the combination that his combination is paired up with.

But what matching corresponds to the optimal strategy? Well, we want to maximize the number of times we'll get a combination in the matching so if $M$ is the set of all matched nodes in $A$, and $w(a)$ is the chance of getting $a$, we want to maximize $$ \sum_{a \in M} w(a) $$

This corresponds to a maximum vertex-weighted matching problem which is a special case of the Assignment Problem. I made a program to solve this problem in C (source can be found here). The optimal strategies will have a success rate equal to the above sum. This will be a fraction where the denominator will be the number of combinations you can get, $N^2 \choose N$ (see A014062). The numerator does not seem to follow any clear pattern. I've calculated the numerator for some small values of $N$ with the program. These are tabulated below.

\begin{array}{ |r|r|r|r| } \hline N & \text{numerator} & \text{denominator} & \text{success rate} \\ \hline 2 & 5 & 6 & 0.833 \\ 3 & 72 & 84 & 0.857 \\ 4 & 1580 & 1820 & 0.868 \\ 5 & 45250 & 53130 & 0.852 \\ 6 & 1705347 & 1947792 & 0.876 \\ 7 & 73502156 & 85900584 & 0.856 \\ 8 & 3828899328 & 4426165368 & 0.865 \\ 9 & 225664027650 & 260887834350 & 0.865 \\ 10 & 14914814271500 & 17310309456440 & 0.862 \\ 11 & 1114114517120572 & 1276749965026536 & 0.873 \\ 12 & 89759340344067264 & 103619293824707388 & 0.866 \\ 13 & 7983517393715262672 & 9176358300744339432 & 0.870 \\ \hline \end{array}

Because of the exponential nature of this problem, calculating the success rate for larger $N$ would be pretty hard – at least with this approach.