Solved – How to properly develop a machine learning model for a poker game

data mining, games, machine learning, modeling

I've created an annotation for poker games, similar to chess games. After compiling information from thousands of games, I want to use this large data set for machine learning.

To simplify, let's focus on the last betting round of a Texas Hold'em game. The five cards on the table are visible, and the computer needs to make a decision (fold, check, bet or raise). Of course, the available actions depend on the circumstances of the game (e.g. you can only fold if someone has placed a bet).

So the decision is somewhere between a classification (fold, check, bet) and a regression (bet $1,000, bet $1,500, bet $2,125, etc.). And the result should not be 'correct' or 'incorrect', but the payout of your decision. For instance, if you bet $1,500 and the others fold, then you "win" $1,500. But if someone has a better hand than you, then your payout is -1 * your bet, or minus $1,500.
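The mixed classification/regression decision and its payout can be sketched in a few lines of Python. This is only an illustration of the simplified rule described above: the names `Kind`, `Action` and `payout` are hypothetical, the pot is ignored, and two cases the question does not spell out are filled in by assumption (a called bet with the best hand wins the amount bet, and a fold or check pays 0).

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    FOLD = "fold"
    CHECK = "check"
    BET = "bet"


@dataclass(frozen=True)
class Action:
    kind: Kind           # the "classification" part of the decision
    amount: float = 0.0  # the "regression" part, only meaningful for BET


def payout(action: Action, opponent_folds: bool, we_have_best_hand: bool) -> float:
    """Payout under the simplified rule from the question (pot ignored)."""
    if action.kind is not Kind.BET:
        # Assumption: folding or checking neither wins nor loses anything.
        return 0.0
    if opponent_folds or we_have_best_hand:
        # Assumption for the called-and-ahead case: we win the amount bet.
        return action.amount
    # Called by a better hand: we lose the amount bet.
    return -action.amount
```

With this representation, a model can predict `kind` as a class label and `amount` as a continuous value, while the training signal is the number returned by `payout` rather than a right/wrong label.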

I need to stress that there is no single correct option. In poker, you need to maximize your returns in the long run. The exact same decision in an identical situation can have different outcomes depending on whether the opponent shows a pair of aces or a pair of threes in the last round. The computer can't know when to fold and when to bet in this situation, so it needs to be consistent in the long run.

So, let's add some requirements:

  • The overall 'result' of a technique should be the sum of all the individual instance results;
  • It would be good to have conditional decisions, e.g. you check, the opponent raises, and then you need to decide again based on this action;
  • The main goal of this modeling is to implement different machine learning techniques, and use the data set to evaluate them.
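As a rough sketch of the first and third requirements, the snippet below scores competing strategies by summing their per-hand payouts over a data set. The record layout is an assumption: it presumes the annotation lets you reconstruct what each action would have paid in each hand, which may not hold for real logs. `Record`, `total_result` and the three strategies are hypothetical names, and conditional (multi-step) decisions are not modeled here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Record:
    hand_strength: float       # hypothetical feature in [0, 1]
    payouts: Dict[str, float]  # payout each action would have earned


# A strategy maps the observed feature to an action name.
Strategy = Callable[[float], str]


def total_result(strategy: Strategy, records: List[Record]) -> float:
    """Overall result = sum of the individual instance results."""
    return sum(rec.payouts[strategy(rec.hand_strength)] for rec in records)


def always_check(hand_strength: float) -> str:
    return "check"


def always_bet(hand_strength: float) -> str:
    return "bet"


def bet_when_strong(hand_strength: float) -> str:
    return "bet" if hand_strength >= 0.8 else "check"


# Made-up illustrative data, not real game results.
records = [
    Record(0.9, {"check": 0.0, "bet": 1500.0}),
    Record(0.3, {"check": 0.0, "bet": -1500.0}),
    Record(0.7, {"check": 0.0, "bet": -1500.0}),
]

if __name__ == "__main__":
    for strategy in (always_check, always_bet, bet_when_strong):
        print(strategy.__name__, total_result(strategy, records))
```

Any learned model can be plugged in as a `Strategy`, so different machine learning techniques are compared on the same total rather than on per-hand accuracy.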

Any suggestions for an existing technique on which I should base my modeling?

Best Answer

Generally, reinforcement learning is what you're looking for, though most reinforcement learning implementations focus on perfect-information games (e.g. Pong), not imperfect-information games (e.g. poker).
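To make the idea concrete, here is a minimal sketch of the simplest reinforcement learning setting, an epsilon-greedy multi-armed bandit, applied to a single river decision. It is not a deep reinforcement learning poker algorithm and it ignores hidden information and opponent adaptation. The toy environment is an assumption: a bet wins its amount with probability 0.6 and loses it otherwise, and a check pays 0. All names (`ACTIONS`, `simulate_payout`, `train`) are hypothetical.

```python
import random
from typing import Dict

ACTIONS = ["check", "bet_1000", "bet_1500"]


def simulate_payout(action: str, rng: random.Random) -> float:
    """Toy environment (assumption): a bet wins its amount with probability 0.6."""
    if action == "check":
        return 0.0
    amount = float(action.split("_")[1])
    return amount if rng.random() < 0.6 else -amount


def train(episodes: int = 50000, epsilon: float = 0.1, seed: int = 0) -> Dict[str, float]:
    """Estimate the average payout of each action with epsilon-greedy sampling."""
    rng = random.Random(seed)
    value = {a: 0.0 for a in ACTIONS}  # running mean payout per action
    count = {a: 0 for a in ACTIONS}    # number of times each action was tried
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)              # explore
        else:
            action = max(ACTIONS, key=value.get)      # exploit current best estimate
        reward = simulate_payout(action, rng)
        count[action] += 1
        # Incremental update of the sample mean.
        value[action] += (reward - value[action]) / count[action]
    return value


if __name__ == "__main__":
    for action, estimate in train().items():
        print(f"{action}: estimated average payout {estimate:.1f}")
```

In this toy environment the true expected payout of a bet of size a is 0.6a - 0.4a = 0.2a, so the learner is rewarded for long-run average payout, not for being "right" on any single hand. Handling real poker would additionally require conditioning on the game state and on the opponent's possible hidden hands.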

A deep reinforcement learning algorithm has recently achieved super-human play in full-scale Limit Hold'em. Of the machine learning approaches to AI poker, this is state-of-the-art, though it is still ~5BB/100 hands worse than the best domain-specific expert AI (excluding the AI that effectively solved the game).

I suspect we'll see deep reinforcement learning approaches to no-limit start to be competitive with top domain-specific expert AI in the next 5 years.
