Libratus — The AI that Defeated Humans in the Game of Poker

  • Significant Amount of Mathematical Jargon: An entire section of the paper is completely occupied by definitions of mathematic symbols. Once the mathematic definitions ended, the rest of the paper is a lengthy proof of a theorem. The first time you read it, the thesis feels like an endless fog of mathematics.
  • Absence of a Popular Subject: Currently, the two hot-topics in the field of machine learning are deep learning and reinforcement learning. However, the main topics of this paper are Game Theory and operations research.
  • Methodology Versatility. Some readers are under the impression that one can only apply Libratus’ algorithm to Texas hold’em, or at best could one can extend it to other board games. Readers feel that the algorithm has limited application prospects.

Understanding Libratus Algorithm Methodology

In terms of methodology, the Libratus algorithm is a combination of both Game Theory and operations research. Markov decision making process and dynamic planning comprise the theoretical foundation of reinforcement learning. Even though the sources are different, the two will eventually converge.

Four Key Concepts of Libratus Thesis

Understanding the Libratus paper requires you to understand four key concepts.

  • Nash equilibrium strategy
  • Counterfactual best response
  • Abstraction
  • End game

Nash Equilibrium Strategy: Do Not Give Your Opponent Any Easy Wins

Libratus is currently only capable of playing two-person poker. However, future improvements to the algorithm and the addition of more GPUs could allow it to play against multiple opponents at the same time.
When playing Texas hold’em, each player is first dealt a hand of two cards, and then the poker dealer reveals five more cards in succession. There are four rounds of betting, after which the players compare their hands to see who wins.

How does Libratus Determine its Opponent’s Betting Strategy?

Actually, Libratus does not speculate on its opponent’s betting strategy.
To be a bit more specific, Libratus’ only goal is to reach the Nash Equilibrium Strategy. Nash was a famous mathematician and the subject of the film “A Beautiful Mind.” The goal of Nash Equilibrium Strategy is to avoid giving your opponent any easy wins.

How Can You Find the Most Optimal Betting Strategy?

Since the Nash Equilibrium Strategy does not attempt to determine the opponent’s strategy, we can use recursive machine Game Theory to analyze every possible hand combination and find the best possible betting strategy.

  • Column 1, our hand (pre-flop) has (5252)/ (21) = 1,326 possible combinations.
  • Column 2, the first three community cards, have (504948)/ (321) = 19600 possible combinations. Then comes the next two community cards (turn and river). Added together, there are a total of 19600 + 19600 47 + 19600 47 * 46 = 43,316,000 possible combinations.

Counterfactual Best Response: A Popular Algorithm for Finding the Nash Equilibrium Strategy

Counterfactual Best Response (CBR) is an algorithm to find the Nash Equilibrium Strategy which has recently enjoyed quite a bit of popularity.
There are three main elements to Counterfactual Best Response:

  • Simulation
  • Regret
  • Iteration
  • We begin by finding all the relevant rows in our updated betting column according to the values in the first four columns that represent the present situation.
  • In our updated table, the sixth column is the betting strategy, while the seventh column consists of average profit for each strategy. We then progress towards selecting a betting strategy according to average profit. The higher the average profit, the higher the probability we choose the corresponding betting strategy.

Simplify the Game Situation to Minimize the Required Computing Power

As we mentioned in the previous section, the two columns alone in the betting table contain a maximum of 57,437,016,000 different combinations of values. Adding every possible combination of betting history will result in 10165 possible value combinations

Endgame, Off-tree Problems, and Re-solving

Simplifying the game situation can also lead to miscalculations. For example, we could mistakenly equate a pair of 7s to a pair of 8s. Be that as it may, if both parties keep betting and neither party folds, their cards will inevitably be shown, at which point the pair of 7s will lose to the pair of 8s, even though the difference between them is small. This is what we call an off-tree problem.

Conclusion

The introduction of an AI system like Libratus is one of many ways through which AI is changing the way we understand human intelligence. I hope this article helped you understand how Libratus functions and it could beat its competitors. I am sure with time we will witness similar programs for other games as well.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Alibaba Cloud

Alibaba Cloud

Follow me to keep abreast with the latest technology news, industry insights, and developer trends. Alibaba Cloud website:https://www.alibabacloud.com