[Math] Probability 2 player card game with multiple patterns to win, who has the advantage

card-games, game theory, probability

$2$ people (C and D) decide to play a computer-assisted game. The computer is programmed to quickly play as many fair hands as needed (using the equivalent of a fair $52$ card deck) until someone wins a hand, so a win appears almost instantly when they press a key; it then waits for the next keypress to start the next game. The computer is necessary because wins are so infrequent that a real deck of cards would take far too long between wins.

The rules are that player C wins if he gets $3$ "gapped" $3$ card straights, each dealt in order. An A (ace) is considered the highest rank for both C and D, so $A23~567~9TJ$ (T = ten rank) would NOT be a win for C but $234~QKA~789$ would be. There has to be a gap of at least $1$ rank between the $3$ card straights, but it can be wider, as shown in the 2nd example. Note that the cards within each straight of length $3$ have to be dealt in order, but the $3$ straights themselves don't need to be in relative order (as shown by $234QKA789$). That is still a win for C and need not be $234789QKA$ (although that would also be a win for C). The simplest example of a win for C would be something like $234678TJQ$. This is also the lowest ranked win for C. The highest ranked win for C would be $45689TQKA$. Remember that permutations of the straights are allowed, so $89TQKA456$ is also a win, but $243678TJQ$ is not. Think of each gapped $3$ card straight as a letter: with $234678TJQ$, for example, X=$234$, Y=$678$, Z=$TJQ$, and the valid permutations are XYZ, XZY, YXZ, YZX, ZXY, and ZYX.

For player D to win, $2$ pattern classes are allowed. Either any $6$ card straight (such as $345678$) or this pattern: $223344$, $334455$, $556677$… $QQKKAA$.

The computer will keep dealing community (shared) cards until either there is a winner of the hand (not likely) or all $52$ cards are depleted (very likely). The cards are reshuffled after each win and after each nonwin (all $52$ cards depleted). Permutations of the cards within a straight or pattern are NOT allowed: the straights and patterns MUST be dealt in order. For example, $345678$ is a win for D but $357468$ is not, even though they could be the exact same cards. Suits are irrelevant for the straights and other winning patterns; any suits are allowed.

So it is a rainy day and they play many hundreds of winning hands to see who gets more wins. The question is who has the mathematical advantage and by how much? A hand that ties can be ignored as they will not count that hand and will continue play. Neither player has any knowledge of the expected probability and both just play on a "hunch" based on the observed patterns to win which they are told about.

The $300$ bounty ends Monday night (Aug 22nd) around 11:59PM (Eastern time) already including the $24$ hour grace period. Other simulations are encouraged but also math solutions or partial math solutions (like how to set it up properly).

Sample data of $3$ million random shuffles can be downloaded here for testing and/or analysis toward a mathematical solution:

https://drive.google.com/file/d/0BweDAVsuCEM1amhsNmFITnEwd2s/view

Format is $52$ bytes of data per line and $2$ more bytes for the end of line so $54$ bytes total per line, $3$ million lines. "$2$" = rank $2$, "$3$" = rank $3$… "T" = rank $10$…. All cards are represented by this set of ranks: "$23456789TJQKA$". There are $4$ of each rank in each shuffle (suits are irrelevant in this card game).
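The line format described above can be checked with a short Python sketch. The function name `parse_shuffle` is hypothetical, and treating the $2$ end-of-line bytes as `\r\n` is an assumption; the code strips either line-ending style.

```python
from collections import Counter

RANKS = "23456789TJQKA"


def parse_shuffle(line: str) -> str:
    """Return the 52-character deck from one line, or raise ValueError."""
    deck = line.rstrip("\r\n")  # assumption: the 2 extra bytes are CR LF
    if len(deck) != 52:
        raise ValueError(f"expected 52 rank characters, got {len(deck)}")
    counts = Counter(deck)
    # Every rank must appear exactly 4 times and no other symbol may appear.
    if set(counts) != set(RANKS) or any(counts[r] != 4 for r in RANKS):
        raise ValueError("each of the 13 ranks must appear exactly 4 times")
    return deck
```

Reading the downloaded file then amounts to calling `parse_shuffle` on each line, for example inside `with open(path, newline="") as f:` so that the line endings are passed through unchanged.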

Where are the mathematical attempts at solving the probabilities of this card game? How about even some partial solutions like take the winning D patterns and compute the probability of a D win if D was the only player, subtracting out the overcounting cases where $2$ (or more) of those patterns can appear in the same shuffled deck. For example, if we get something like $234567223344$

Best Answer

My simulation program works by first shuffling the entire $52$ card deck, then searching the entire deck for candidate wins for each player, and then simply selecting the candidate win with the lowest final drawn card position. Normally there is only $1$ candidate win per winning hand, as can be seen from there being only $769$ candidate ties. A candidate tie is when the program detects a possible win for both players; these normally complete at different drawn card positions, but very rarely they complete at the same one. In $22$ cases they were the same, thus resulting in a tie. It was easier for me to let the program search the entire $52$ card deck than to draw $1$ card at a time and look for winners after each card (starting after the $6$th card is drawn, which is the minimum # of cards needed for a player D win).
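A minimal Python sketch of the search described above, assuming the deck is a string of rank characters in deal order. This is not the original simulation program; the function names (`d_win_position`, `c_win_position`, `play_hand`) are hypothetical, and the rules are my reading of the question.

```python
from itertools import combinations

RANKS = "23456789TJQKA"
RUNS = {RANKS[i:i + 3] for i in range(11)}  # 234 .. QKA
# D's patterns: 8 six-card straights plus 11 runs of three consecutive pairs.
D_PATTERNS = {RANKS[i:i + 6] for i in range(8)} | {
    "".join(r + r for r in RANKS[i:i + 3]) for i in range(11)
}


def d_win_position(deck):
    """Cards drawn when D's earliest pattern completes, or None."""
    for start in range(len(deck) - 5):
        if deck[start:start + 6] in D_PATTERNS:
            return start + 6
    return None


def c_win_position(deck):
    """Cards drawn when C's earliest winning triple completes, or None."""
    # Each contiguous 3-card straight: (cards drawn at completion, index of lowest rank).
    hits = [(i + 3, RANKS.index(deck[i]))
            for i in range(len(deck) - 2) if deck[i:i + 3] in RUNS]
    best = None
    for triple in combinations(hits, 3):
        low = sorted(rank for _, rank in triple)
        # Lowest ranks at least 4 apart means a gap of at least one rank between
        # straights. It also excludes overlapping windows, whose lowest ranks
        # differ by at most 2.
        if low[1] - low[0] >= 4 and low[2] - low[1] >= 4:
            end = max(pos for pos, _ in triple)
            best = end if best is None else min(best, end)
    return best


def play_hand(deck):
    """Return 'C', 'D', 'tie', or 'none' for one shuffled deck."""
    c, d = c_win_position(deck), d_win_position(deck)
    if c is None and d is None:
        return "none"
    if d is None or (c is not None and c < d):
        return "C"
    if c is None or d < c:
        return "D"
    return "tie"
```

For instance, `play_hand("34589T9TJQKA")` returns `"tie"`, because C's third straight and D's $6$ card straight both complete on the 12th card. A random hand can be generated with `"".join(random.sample(RANKS * 4, 52))` after `import random`.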

I have some results from a simulation program. Also it is interesting that since the game is so close to $50/50$, as the simulation runs, the partial results show C sometimes winning. This is why $1$ million hands is not enough.

$1,000,000,000$ : simulated hands.
$24$ hours : approx runtime using interpreted language.
$11,500$ : approx # of hands played per second.
$3.06$ GHz dual core Intel : CPU speed but only $50$% used by simulation.
$152,981$ : wins for player C.
$157,822$ : wins for player D.
$49.22$% : $50.78$% ... approx ratio of C wins to D wins.
$99.9689$% : approx percentage of nonwinning hands (deck exhausted without a winner).
$41.1453$ : approx # of average cards drawn for C win (only counting when C actually wins).
$28.9136$ : approx # of average cards drawn for D win (only counting when D actually wins).
$769$ : candidate ties (both players have possible win in entire shuffled deck).
$22$ : actual ties.
$35:1$ : approx candidate tie to actual tie ratio.
$6537$ : approx average hands per C win.
$6336$ : approx average hands per D win.
$3217$ : approx average hands for any win (or tie).
$45,454,545$ : approx average hands for a tie.
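The derived figures in the list above follow directly from the raw counts, which the following Python sketch recomputes. The final z-score is my addition, not part of the original results: it assumes that, if the game were exactly fair, each decided hand would be an independent fair coin flip.

```python
import math

hands = 1_000_000_000
c_wins, d_wins, ties = 152_981, 157_822, 22

decided = c_wins + d_wins                      # hands with a single winner
c_share = c_wins / decided
d_share = d_wins / decided
hands_per_c_win = hands / c_wins
hands_per_d_win = hands / d_wins
hands_per_result = hands / (decided + ties)    # any win or tie
no_winner_share = 1 - (decided + ties) / hands

# Assumption: under a fair game, D's wins ~ Binomial(decided, 0.5),
# so the standard deviation is sqrt(decided) / 2.
z = (d_wins - decided / 2) / (math.sqrt(decided) / 2)

print(f"C {c_share:.2%} vs D {d_share:.2%}")
print(f"hands per C win {hands_per_c_win:.0f}, per D win {hands_per_d_win:.0f}")
print(f"hands per win or tie {hands_per_result:.0f}")
print(f"no-winner share {no_winner_share:.4%}")
print(f"z-score of D's lead {z:.2f}")
```

Under that assumption the z-score comes out well above $8$, so D's lead in $1$ billion hands is very unlikely to be noise. With only $1$ million hands there would be roughly $311$ decided hands, far too few to separate $49.2$% from $50$%.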

I am hoping someone else can do some analysis too to help confirm these numbers. Perhaps some mathematical calculations to help show that player D has a slight advantage. Also if someone would like some other stats tracked in the simulation program just let me know in the comment section and I can try to put it in and then post those results.

Here are the patterns we need to check for a possible C win. Note that the length $3$ straights can appear in any order, so for example $234678TJQ$ and $678234TJQ$ are equally good winners for player C. Those are just $2$ of the $6$ possible permutations of each of these $10$ main patterns, so there are $60$ actual patterns including permutations. The comma indicates that any number of cards (including $0$ cards) may appear between the straights. For example, in pattern $1$, if the drawn cards are $K234J7ATJQ2495678$, that is a winner for player C because it follows (a permutation of) a winning pattern.

$~~1)~234,678,TJQ$
$~~2)~234,678,JQK$
$~~3)~234,678,QKA$
$~~4)~234,789,JQK$
$~~5)~234,789,QKA$
$~~6)~234,89T,QKA$
$~~7)~345,789,JQK$
$~~8)~345,789,QKA$
$~~9)~345,89T,QKA$
$10)~456,89T,QKA$

The winning patterns for D are (no permutations are allowed):

$~~1)~234567$
$~~2)~345678$
$~~3)~456789$
$~~4)~56789T$
$~~5)~6789TJ$
$~~6)~789TJQ$
$~~7)~89TJQK$
$~~8)~9TJQKA$
$~~9)~223344$
$10)~334455$
$11)~445566$
$12)~556677$
$13)~667788$
$14)~778899$
$15)~8899TT$
$16)~99TTJJ$
$17)~TTJJQQ$
$18)~JJQQKK$
$19)~QQKKAA$

Note that for player D, NO permutations of these $19$ patterns are allowed. They must appear EXACTLY as shown here. For example, $QQJJKK$ is NOT a win for player D. This is a clarification not a rule change.
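Both pattern lists can be generated rather than typed in, which gives a quick check of the counts ($10$ main patterns and $60$ permutations for C, $19$ patterns for D). This is a sketch in Python; the variable names are my own.

```python
from itertools import permutations

RANKS = "23456789TJQKA"
RUNS = [RANKS[i:i + 3] for i in range(11)]  # 234, 345, ..., QKA

# C: three straights whose lowest ranks are at least 4 apart,
# i.e. at least one unused rank between consecutive straights.
c_main = [
    (RUNS[a], RUNS[b], RUNS[c])
    for a in range(11)
    for b in range(a + 4, 11)
    for c in range(b + 4, 11)
]
c_ordered = [p for triple in c_main for p in permutations(triple)]

# D: 6-card straights and runs of three consecutive pairs, in the listed order.
d_straights = [RANKS[i:i + 6] for i in range(8)]
d_pairs = ["".join(r + r for r in RANKS[i:i + 3]) for i in range(11)]
d_patterns = d_straights + d_pairs

print(len(c_main), len(c_ordered), len(d_patterns))
for number, triple in enumerate(c_main, start=1):
    print(number, ",".join(triple))
```

The comprehension visits the C patterns in the same order as the numbered list, starting at `234,678,TJQ` and ending at `456,89T,QKA`.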

It is somewhat amazing to me that wins are rare for both C and D (based on the average number of hands between wins), yet they are almost equally rare, making the game almost $50/50$, like a fair coin toss.

In case anyone is interested, I had the program display the cards for all $22$ ties. It displays the entire deck in the shuffled order for that hand so you can just trace where the tie occurred. For example, in the first tie, the relevant cards are $,345,89T,9TJQKA$. Here is a cropped partial screenshot for the simulation output for the ties: (T= ten).

[Screenshot: simulation output showing the shuffled decks for the tied hands]

I will now attempt to solve this mathematically but in a very simple way.

Looking at player D's ways to win, there are only $2$ pattern classes. The first, a $6$ card straight, has probability $32/52 \times 4/51 \times 4/50 \times 4/49 \times 4/48 \times 4/47$ of appearing on the first $6$ cards drawn. A $6$ card window can start at any of $47$ positions in the deck, so we multiply by $47$, assuming the straight is equally likely to appear anywhere in the deck. So far that is about $1/9518$.

Next we have player D's 2nd pattern class ($223344, 334455, \ldots, QQKKAA$), which has probability $44/52 \times 3/51 \times 4/50 \times 3/49 \times 4/48 \times 3/47$ on the first $6$ cards. Again we multiply by $47$, since that $6$ card pattern can appear anywhere in the deck, giving about $1/16407$. Adding the $2$ probabilities we get about $1/6024$, which is fairly close to the $1/6336$ reported by the simulation program. Considering the simplicity of the calculation, it is quite a good approximation, only roughly $5$% off.

The slight difference could come from a combination of interaction with C's wins, ties, and the patterns not being uniformly distributed over all $47$ positions. For example, imagine we get $234567$ very late in the deck. It is quite possible that $234$, $345$, or $456$ gives C the win instead, or that $567$ gives C and D a tie. Thus we would expect D's win chance to be lower than $1/6024$, which means the denominator has to be larger, more like $6336$.

We must also consider cases where multiple $6$ card straights appear. For example, it may be possible that $234567$ appears early in the deck but something like $6789TJ$ appears much later (towards the end of the deck). Using my simple math, I would overcount these. I think that in the worst case there could be $8$ straights of length $6$ in the same deck, but that would be an astronomically rare event. Perhaps something like $234567,~234567,~234567,~234567,~89TJQK,~89TJQK,~89TJQK,~9TJQKA$. In my simulation program, I could count how many times a straight of length $6$ appears more than once in the same shuffled deck and add that to the table of results.
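The two estimates for D can be reproduced with exact fractions, as in this Python sketch (the helper name `first_six` is hypothetical). Note that multiplying a single-window probability by $47$ gives, by linearity of expectation, the expected number of matching windows in the deck, so it overcounts decks containing more than one D pattern and ignores C entirely.

```python
from fractions import Fraction
from math import prod


def first_six(numerators):
    """Probability of a 6-card pattern class on the first 6 cards drawn."""
    return prod(Fraction(n, 52 - i) for i, n in enumerate(numerators))


windows = 47  # possible starting positions of a 6-card window in 52 cards

straight = windows * first_six([32, 4, 4, 4, 4, 4])   # 234567 .. 9TJQKA
pairs = windows * first_six([44, 3, 4, 3, 4, 3])      # 223344 .. QQKKAA
total = straight + pairs

print(f"straights: 1 in {float(1 / straight):.1f}")
print(f"pair runs: 1 in {float(1 / pairs):.1f}")
print(f"combined:  1 in {float(1 / total):.1f}")
```

Using `Fraction` keeps the arithmetic exact until the final conversion to a float for printing.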

So I have done some of the work for you. I just need someone to somehow compute the interactions and account for those in the final probability. I will attempt to calculate C's win probability, also using simple math if possible.

Now let's look at C's win pattern. Let's take $234,678,TJQ$ as an example. Any one of those $3$ card straights can appear first. Let's suppose the $234$ appears first. It MUST be complete by card draw $46$ to leave room for the other $2$ parts of the pattern. The probability of getting $234$ on the first $3$ cards drawn is $4/52 \times 4/51 \times 4/50 = 1/2072$. From here it gets more complicated.

Placing the three $3$ card straights might be one of those stars and bars type problems. Treat the $234$ as one entity and place it anywhere in the $52$ card deck, and do the same for the $678$ and $TJQ$. Find the probability of getting $234,678,TJQ$ in order (without completing a $6$ card straight), then multiply by $6$ to account for the possible permutations of them, then multiply by $10$ to account for the $9$ other similar win patterns for C. This should be a rough approximation. I think one way to solve this is to use stars and bars with a gap of at least $2$ cards (such as $2,3,4,i,j,6,7,8$), so we are guaranteed not to have a $6$ card straight, and then handle the cases where the gap is $1$ card separately, so that we only place cards there that will not complete the $6$ card straight.

Also, if anyone needs a count of some pattern that would help them solve this with math, just let me know and I can try to create a bucket for it in my sim prog. I would love to get an exact answer to this problem, but based on the lack of answers here (even with a $150$ bounty), I am assuming it is not easy.
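The stars and bars idea can be turned into a rough first estimate, sketched here in Python under simplifying assumptions of my own: each $3$ card straight is treated as a single block among the $43$ other cards, D's wins are ignored, and decks containing more than one winning arrangement are counted more than once. The result is therefore the expected number of C arrangements per deck, not C's exact win probability.

```python
from fractions import Fraction
from math import comb, perm

# 3 blocks plus 43 other cards form 46 slots; choose which 3 slots hold blocks.
placements = comb(46, 3)

# 10 main patterns, each with 6 orderings of its three straights.
ordered_patterns = 10 * 6

# 9 fixed positions must hold 9 specified, distinct ranks (4 cards of each rank).
p_fixed = Fraction(4**9, perm(52, 9))

expected_c = placements * ordered_patterns * p_fixed
print(f"about 1 in {float(1 / expected_c):.1f}")
```

This comes out at roughly $1$ in $5592$, noticeably more frequent than the simulated $1$ in $6537$, which is the direction one would expect from the overcounting (for example, $TJQK$ dealt together matches both the $TJQ$ and $JQK$ versions of a pattern).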
