My Attempt at Writing a Poker Bot

Joseph Cottingham
7 min read · Mar 18, 2022

Note that this is an academic project and I have no intention of using it in real-money online poker, as that goes against nearly every service's terms of service.

Over the last couple of days I have attempted to build a poker bot… I say attempted because, as you will soon see, many problems have stood in my way. This project, however, is representative of many of the problems that arise when approaching deep learning in general.

A full poker bot has many aspects to its development, and, in line with other complex machine learning problems, it cannot be solved with a single-model approach; instead it must be broken down into subsystems/models that handle specialized tasks. Figure #1 below gives a system-wide overview of what the final version of this project will be.

Figure #1

At this point in the project, the Hand Valuation Model and Game Decision Model are the focus, as they make up the most difficult problems. The other aspects of the project remain to be worked on.

Hand Valuation Model

This model's purpose is to determine the strength of a hand. Its output is a classification of the poker hand type, represented by the integers 0–9. These classes correspond, in order of increasing strength, to: nothing, one pair, two pair, three of a kind, straight, flush, full house, four of a kind, straight flush, royal flush.

The feature inputs of this model are shown in the table below in Figure #2.

Figure #2

Rank is represented by the integer values 0–13, with 0 representing no card. Suit is represented by the integer values 0–4, with 0 representing no card.

This set of feature inputs is not ideal for my particular game, because there are many different types of poker. I am building this AI to play Texas Hold'em, a flavor of poker where each player is dealt two personal cards available only for their own use, and then five community cards are dealt which may be used by all players. This means that at any time, seven potential cards (known or not yet known) are available to the player. Therefore, training the model on five-card hands presents implementation problems that may cause additional misclassification. Unfortunately, I could not find any existing seven-card training datasets; however, this can be overcome programmatically.
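One way to bridge that gap programmatically is to score every five-card subset of the (up to) seven available cards with a five-card classifier and keep the strongest result. Here is a minimal sketch of that idea; `classify_five` is a hypothetical callable standing in for any trained five-card model or rule-based scorer, not something from this repo.

```python
from itertools import combinations

def best_hand_class(cards, classify_five):
    """Score every 5-card subset of up to 7 known cards and keep the best.

    `cards` is a list of (rank, suit) tuples and `classify_five` is any
    callable mapping exactly five cards to the 0-9 hand-class integers
    described above. Because the classes are ordered by strength, the
    maximum over all C(7, 5) = 21 subsets is the best achievable hand.
    """
    best = 0
    for subset in combinations(cards, 5):
        best = max(best, classify_five(list(subset)))
    return best
```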

For training and testing, the University of California Irvine Poker Hand Data Set was used. This dataset contains 25,010 poker hands, each labeled with the corresponding hand strength.
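For reference, loading the dataset looks roughly like this, assuming the file name and column layout published on the UCI repository page (five suit/rank pairs followed by the class label):

```python
import pandas as pd

# Column layout of the UCI Poker Hand Data Set: five (suit, rank) pairs
# followed by the 0-9 hand class.
COLUMNS = ["S1", "C1", "S2", "C2", "S3", "C3", "S4", "C4", "S5", "C5", "CLASS"]

train = pd.read_csv(
    "https://archive.ics.uci.edu/ml/machine-learning-databases/poker/"
    "poker-hand-training-true.data",
    names=COLUMNS,
)

X, y = train[COLUMNS[:-1]], train["CLASS"]
print(train["CLASS"].value_counts().sort_index())  # exposes the class imbalance
```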

Figure #3

As you can see in Figure #3, there are exponentially more weak hands in this dataset than strong hands. While this is accurate to the statistical probability of the hands, it artificially inflates the accuracy of the model by leaving it under-trained on stronger hands. This is most evident in my attempt to use a neural network for classification. My neural network model is shown in Figure #4.

Neural Network (Deep Learning)

Figure #4
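Figure #4 shows the actual architecture; as a rough point of reference, a Keras network along these lines might look like the sketch below. The layer widths here are placeholders of my own, not the exact ones in the figure.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Hypothetical stand-in for the network in Figure #4: ten card features in
# (5 cards x rank/suit), softmax over the ten hand classes out.
model = keras.Sequential([
    keras.Input(shape=(10,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(64, activation="relu"),
    layers.Dense(10, activation="softmax"),
])

model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",  # integer 0-9 labels
    metrics=["accuracy"],
)
# model.fit(X.values, y.values, epochs=20, validation_split=0.1)
```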

This model produced an accuracy of 51% on the testing data. Given that this is a numerical classification problem, that is terrible performance. Based on the data visible in Figure #5, I believe this is because the model is over-trained on low-strength hands and under-trained on high-strength hands. While I do think this is resolvable by equalizing the number of high-strength and low-strength hands, I do not have the data readily available to do so. Therefore, since this is a numerical classification problem with clear feature signals, I wanted to try a classical decision tree model that would not be so easily skewed by the overwhelming number of low-strength hands.

Figure #5

Decision Tree (Classical Machine Learning)

After a quick test where, without much configuration, I had an accuracy in the 80% range, I doubled down and ran a more thorough test of different decision tree configurations. In particular, I tested a variety of maximum leaf node counts, splitting criteria, maximum depths, and minimum samples per split. This covered 2,800 separate configurations, with the strongest producing an accuracy of 96.86%. While I am currently satisfied with this accuracy, more testing is required with related tree-based models, which should bring the accuracy up significantly. Unfortunately, the error seen in this model is potentially related to the error seen in the neural network, as shown in Figure #6.
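A sweep like this can be run with scikit-learn's GridSearchCV over those four hyperparameters. The sketch below is illustrative only; the grids are placeholder values, not the exact 2,800 configurations I tested.

```python
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

def sweep_trees(X, y):
    """Cross-validated sweep over the four decision tree hyperparameters
    mentioned above. The grids are placeholders for illustration."""
    param_grid = {
        "criterion": ["gini", "entropy"],
        "max_leaf_nodes": [None, 50, 100, 500, 1000],
        "max_depth": [None, 5, 10, 20, 40],
        "min_samples_split": [2, 5, 10, 20],
    }
    search = GridSearchCV(DecisionTreeClassifier(random_state=0), param_grid, cv=5)
    search.fit(X, y)
    return search.best_params_, search.best_score_
```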

Figure #6

As can be seen in Figure #6, the model has a bias towards predicting lower labels, and I believe the cause here is also the lack of samples with stronger hands.

While this model will function for the time being, improvements need to be made. In particular, I want to gather more training data from real games, which will skew towards disproportionately strong hands, because in most datasets hands are only shown if a showdown occurs, which requires at least two players to play through the river.

Game Decision Model

Building this model brought with it a problem anyone who has done deep learning work has faced: a lack of datasets. Because of the financial value of poker game data, it took me a long while to locate any game data featuring professional players with all players' hands recorded. Without knowing the entire table's hands, I was not going to be able to train this model the way it needed to be trained, because I would not have a feature profile for folds and bluffs. I ended up using data produced when Pluribus played with professional players. Pluribus is a poker bot developed by Facebook in partnership with Carnegie Mellon University. While I would have liked to use data without Pluribus in a seat, I could not find any. Although this is the data I used for training at this point, my preprocessing scripts take PokerStars (a large online poker company) game logs as input. This is thanks to Kevin Wang, who wrote a conversion script that converts the Pluribus raw game data into the PokerStars game log format.

Preprocessing

There was a lot of preprocessing to do, because I needed to set up every turn as a sample. In the “Game-Data-Preprocessing” directory there is a script that reads all txt files from its “raw-data” directory and converts the game log data into CSV data written to “data-out”. Each output row includes the player's hand, the community cards, the percentage of total chips the player holds, the percentage of the player's chips they have bet, the percentage of the total table chips in the pot, the current stage of the game (0–4, incremented each time community cards are dealt), the move the player made (0–2: fold, call, bet), the result (0 for a bad outcome, 1 for a positive outcome), and the player's hand ranking (generated with the Hand Valuation Model).
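For a concrete picture of one sample, here is a rough sketch of that row layout. The field names are my own labels for illustration, not the headers the script actually writes.

```python
from dataclasses import dataclass
from typing import List, Tuple

Card = Tuple[int, int]  # (rank, suit); 0 means "no card", as in the Hand Valuation Model

@dataclass
class DecisionSample:
    """One player decision from a game log, i.e. one row of the output CSV."""
    hand: List[Card]        # the player's two hole cards
    community: List[Card]   # up to five community cards, padded with (0, 0)
    pct_total_chips: float  # share of all chips held by the player
    pct_chips_bet: float    # share of the player's own chips already bet
    pct_pot: float          # share of the table's chips sitting in the pot
    stage: int              # 0-4, incremented each time community cards are dealt
    move: int               # 0 fold, 1 call, 2 bet (the label)
    result: int             # 0 bad outcome, 1 positive outcome
    hand_rank: int          # 0-9 output of the Hand Valuation Model

    def to_row(self) -> list:
        """Flatten into the CSV/feature row format."""
        cards = [v for card in self.hand + self.community for v in card]
        return cards + [self.pct_total_chips, self.pct_chips_bet, self.pct_pot,
                        self.stage, self.move, self.result, self.hand_rank]
```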

The Pluribus data consisted of 9,908 games, each with a variety of players. Due to the volume of games that had to be processed to generate the training data, only 300 games' worth of data was used. I plan on processing all of the data in the future and training on the result, but due to time restrictions it was not possible at this time. The 300 games processed had a total of 2,536 different decision moments, each recorded as a sample in the training dataset.

Model

The model itself is a simple Sequential design that can be viewed in Figure #7 below.

Figure #7
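As a rough, hypothetical stand-in for that design (the real layer sizes and input width are in Figure #7, not here), a three-way fold/call/bet classifier over the numeric preprocessing features could be set up along these lines:

```python
from tensorflow import keras
from tensorflow.keras import layers

# Placeholder stand-in for the Sequential model in Figure #7. The input width
# (14 card values + 5 numeric features = 19) and layer sizes are assumptions.
decision_model = keras.Sequential([
    keras.Input(shape=(19,)),
    layers.Dense(32, activation="relu"),
    layers.Dense(32, activation="relu"),
    layers.Dense(3, activation="softmax"),   # 0 fold, 1 call, 2 bet
])

decision_model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)
# decision_model.fit(X_train, y_train, epochs=30, validation_split=0.1)
```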

This model, when trained with the full 2,571 samples (10% of samples held back as test data), produced an accuracy of 68.11% on the testing data, with the classification breakdown seen in Figure #8.

Figure #8

The resulting predictions are decent, but much testing and many improvements still need to be implemented. One area of interest is increasing the training data size and variety. This should be easy to do given time to run preprocessing on the entire Pluribus game log dataset; based on the average number of moves per game across the 300 processed games (8.57 per game), there should be over 84,000 samples available for training. Another potential improvement would be shifting from the integer 0–9 value for hand strength to the corresponding percent chance of receiving a hand of that strength. This should give a more differentiated input to the Decision Model, resulting in a more accurate representation of how much stronger each step up in hand strength really is.
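As a sketch of that second idea, the 0–9 class could simply be mapped to the standard probability of being dealt each hand type in five cards. Whether the model is then fed the raw probability, its log, or one minus the cumulative value is an open design choice, not something settled in this project.

```python
# Standard 5-card deal probabilities for each hand class.
HAND_CLASS_PROBABILITY = {
    0: 0.501177,   # nothing / high card
    1: 0.422569,   # one pair
    2: 0.047539,   # two pair
    3: 0.021128,   # three of a kind
    4: 0.003925,   # straight
    5: 0.001965,   # flush
    6: 0.001441,   # full house
    7: 0.000240,   # four of a kind
    8: 0.0000139,  # straight flush
    9: 0.0000015,  # royal flush
}

def strength_feature(hand_class: int) -> float:
    """Replace the integer 0-9 label with the rarity of that hand class."""
    return HAND_CLASS_PROBABILITY[hand_class]
```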

If you wish to build off of this project, the code and instructions to run each script are on GitHub.


Joseph Cottingham

Engineer, with a focus on web apps, mechanical modeling, and microcontroller firmware.