Solved – How to write a poker player using Bayes networks

bayesian, bayesian network, inference, machine learning, sampling

This is my first question on Stack Exchange and also my first time implementing a Bayesian network, so I apologize ahead of time for any novice mistakes.

The goal of my project is to implement a poker player that does Bayesian inference. A group at Monash University in Australia, led by Kevin Korb, has done some work on this, which I am using as a reference. You can find their work here and here. The first reference, being a book, is more helpful and detailed (see Ch. 5.5 and 11 for poker). To start, I am using a simplified version of Texas Hold'em called Leduc Hold'em.

Leduc Hold’em is a two-player poker game. The deck used in Leduc Hold’em contains six cards (two jacks, two queens and two kings) and is shuffled prior to playing a hand. At the beginning of a hand, each player pays a one-chip ante to the pot and receives one private card. A round of betting then takes place, starting with player one. After the round of betting, a single public card is revealed from the deck, which both players use to construct their hand. This card is called the flop. Another round of betting occurs after the flop, again starting with player one, and then a showdown takes place. At a showdown, if either player has paired their private card with the public card, they win all the chips in the pot. If neither player pairs, the player with the higher card is declared the winner. The players split the money in the pot if they have the same private card.
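The showdown rule above can be sketched in a few lines of Python. This is only an illustration of the rules as stated; the function name `showdown`, the `RANKS` mapping and the return convention (1, 2 or 0) are assumptions made for the example.

```python
# Rank order assumed for the example: jack < queen < king.
RANKS = {"J": 1, "Q": 2, "K": 3}

def showdown(card1, card2, public):
    """Return 1 if player one wins, 2 if player two wins, 0 for a split pot."""
    if card1 == card2:
        # Same private card: neither can pair the public card
        # (only two cards of each rank exist), so the pot is split.
        return 0
    if card1 == public:
        return 1  # player one paired the public card
    if card2 == public:
        return 2  # player two paired the public card
    # Nobody paired: the higher private card wins.
    return 1 if RANKS[card1] > RANKS[card2] else 2
```

For example, `showdown("J", "K", "J")` returns 1 because the jack pairs the public card, even though the king is the higher card.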

Each betting round follows the same format. The first player to act has the option to check or bet. When betting the player adds chips into the pot and action moves to the other player. When a player faces a bet, they have the option to fold, call or raise. When folding, a player forfeits the hand and all the money in the pot is awarded to the opposing player. When calling, a player places enough chips into the pot to match the bet faced and the betting round is concluded. When raising, the player must put more chips into the pot than the current bet faced and action moves to the opposing player. If the first player checks initially, the second player may check to conclude the betting round or bet. In Leduc Hold’em there is a limit of one bet and one raise per round. The bets and raises are of a fixed size. This size is two chips in the first betting round and four chips in the second.
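To make the betting format concrete, here is a small sketch that enumerates every complete action sequence of one betting round. It assumes that a raise can only be answered by a fold or a call, which is how I read the one-bet, one-raise limit; the function name `betting_sequences` is made up for the example.

```python
def betting_sequences(history=()):
    """Yield every complete action sequence for a single betting round."""
    last = history[-1] if history else None
    # A fold, a call, or two consecutive checks ends the round.
    if last in ("fold", "call") or history == ("check", "check"):
        yield history
        return
    if last is None or last == "check":
        options = ("check", "bet")            # no bet to face yet
    elif last == "bet":
        options = ("fold", "call", "raise")   # facing a bet
    else:
        options = ("fold", "call")            # facing a raise: cap reached
    for action in options:
        yield from betting_sequences(history + (action,))

sequences = list(betting_sequences())
```

Under these assumptions a round has nine possible sequences, from `("check", "check")` to `("check", "bet", "raise", "call")`. These are the observations from which the opponent's per-round actions would be read.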

Figure 5.14 (p. 185, Section 5.5.2.1) shows a Bayes net for poker.

Bayesian network for a poker player

This is essentially the same network I am using for my project, except that Leduc Hold'em has no up-cards, so the two corresponding up-card nodes do not apply. I was able to compute the conditional probability tables for the node groups (BPP_Win, BPP_Fin, OPP_Fin), (OPP_Fin, BPP_Fin), (OPP_Fin, OPP_Curr), and (BPP_Fin, BPP_Curr), but I am not sure how to compute the conditional probability table for (OPP_Curr, OPP_Action).
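For context, the card-driven tables can be obtained by plain enumeration, which is exactly what does not work for OPP_Action, since that table depends on behaviour rather than on the deck. Below is a minimal sketch for a table like P(final hand | current hand). It assumes only the one private card is known (conditioning on a second known card would shrink the unseen set), and the hand-type labels `"pair"` and `"high card"` are placeholders, not the book's categories.

```python
from collections import Counter
from fractions import Fraction

DECK = ["J", "J", "Q", "Q", "K", "K"]

def final_given_current(private):
    """P(final hand type | private card), enumerating the public card."""
    unseen = list(DECK)
    unseen.remove(private)  # the public card is one of the five unseen cards
    counts = Counter(
        "pair" if public == private else "high card" for public in unseen
    )
    return {hand: Fraction(n, len(unseen)) for hand, n in counts.items()}
```

With these assumptions, any private card pairs the public card with probability 1/5 and stays a high-card hand with probability 4/5.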

Best Answer

From the book you mention:

Note that the existing structure makes the assumption that the opponent's action depends only on its current hand.

And a little bit further:

There are four action probability tables $P_i(OPP\_Action|OPP\_Current)$, corresponding to the four rounds of betting. These report the conditional probabilities per round of the actions — folding, passing/calling or betting/raising — given the opponent's current hand type. BPP adjusts these probabilities over time, using the relative frequency of these behaviors per opponent. Since the rules of poker do not allow the observation of hidden cards unless the hand is held to showdown, these counts are made only for such hands, undoubtedly introducing some bias.

If I understood correctly, you do not have four such tables (Leduc Hold'em has only two betting rounds), but the logic is the same. You start with a prior belief about the opponent's actions. It simply has to be something that a reasonable player would do (raise with high probability when holding a strong hand such as a high pair, fold with high probability when holding a very weak hand, etc.).
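One way to write such a prior down is as pseudo-counts per hand type, which normalise to P(OPP_Action | OPP_Curr). The sketch below is only an illustration: the hand types (just the private card), the numbers in `PRIOR_COUNTS` and the helper `action_probs` are all made up for the example and are not taken from the book.

```python
ACTIONS = ("fold", "call", "raise")

# Hypothetical pseudo-counts: a "reasonable player" folds jacks more often
# and raises kings more often. The values are illustrative guesses.
PRIOR_COUNTS = {
    "J": {"fold": 5.0, "call": 4.0, "raise": 1.0},
    "Q": {"fold": 2.0, "call": 5.0, "raise": 3.0},
    "K": {"fold": 1.0, "call": 3.0, "raise": 6.0},
}

def action_probs(counts, hand):
    """Normalise the counts for one hand type into P(action | hand)."""
    total = sum(counts[hand].values())
    return {action: counts[hand][action] / total for action in ACTIONS}
```

Keeping counts rather than probabilities makes the later update trivial: larger pseudo-counts mean a stronger prior that takes more observed hands to overturn.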

When a hand reaches showdown, the opponent's cards are revealed, so you can reconstruct what the opponent held at each step of the hand and which action they took with it. You then use Bayes' rule to update the probability of each observed action given the hand that was observed.
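A minimal sketch of that update, assuming the table is stored as counts with a uniform pseudo-count prior (a Dirichlet-style relative-frequency update, which is my reading of "relative frequency" in the quote, not necessarily what BPP does). All function names here are hypothetical.

```python
def make_table(hands, actions, pseudo_count=1.0):
    """Count table initialised with a uniform prior pseudo-count."""
    return {h: {a: pseudo_count for a in actions} for h in hands}

def record_showdown(table, revealed_hand, observed_actions):
    """After a showdown, credit each action the opponent took to their hand."""
    for action in observed_actions:
        table[revealed_hand][action] += 1.0

def action_probs(table, hand):
    """Current estimate of P(action | hand) as normalised counts."""
    total = sum(table[hand].values())
    return {a: c / total for a, c in table[hand].items()}

def hand_posterior(table, hand_prior, action):
    """Bayes' rule: P(hand | action) is proportional to P(action | hand) P(hand)."""
    joint = {h: action_probs(table, h)[action] * p for h, p in hand_prior.items()}
    norm = sum(joint.values())
    return {h: v / norm for h, v in joint.items()}

table = make_table(("J", "Q", "K"), ("fold", "call", "raise"))
record_showdown(table, "K", ["raise", "raise"])  # opponent showed a king
uniform = {"J": 1 / 3, "Q": 1 / 3, "K": 1 / 3}
posterior = hand_posterior(table, uniform, "raise")
```

After one observed hand in which the opponent raised twice with a king, the estimate of P(raise | K) moves from 1/3 to 3/5, and seeing a raise in a later hand shifts the belief towards a king (9/19 with the uniform hand prior used here). As the quoted passage warns, only hands that reach showdown are counted, so the estimates are biased towards hands the opponent was willing to show down.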
