Libratus Poker Strategy

Libratus defeated four heads-up poker specialists in January, and researchers have now revealed how it was done. The artificial intelligence beat the professionals in No-Limit Texas Hold’em by breaking the game into smaller, more manageable parts and adjusting its strategy.

Libratus
  • Libratus’s strategy was not programmed in, but rather generated algorithmically. The algorithms are domain-independent and have applicability to a variety of imperfect-information games. Libratus features three main modules, and is powered by new algorithms in each of the three.
  • Three-part attack strategy. Libratus, the researchers claim, was able to see off all comers by using the concept of “subgame solving”, which it used to work around a game of imperfect information such as Hold’em.

Carnegie Mellon University’s Libratus, an artificial intelligence computer program designed to play poker, started the year by proving it could beat four human poker pros. Now, a pair of university researchers behind the program are ending the year by telling the world exactly how the AI program managed to do it.

Libratus beat pros Jason Les, Dong Kim, Daniel McCauley and Jimmy Chou in a 20-day competition held in January at Rivers Casino in Pittsburgh, Pennsylvania. In fact, it beat each of the players at heads-up no-limit hold’em. Over 120,000 total hands, Libratus managed to end the sessions up more than $1.8 million in chips.

This week, Carnegie Mellon professor of computer science Tuomas Sandholm and Ph.D. student Noam Brown published an article in the research journal Science, detailing how Libratus managed to do all that.

According to the article, Libratus was programmed to use a three-pronged approach to the game of poker, a game with more decision points than there are atoms in the universe.

Libratus adjusted on the fly

Poker involves bluffing. So, the researchers said the program was designed to recognize and understand the tactic. It really went deeper than just taking a simple black and white approach to the game.

Sandholm and Brown claim Libratus was able to break poker down into computationally manageable parts. That way it could fix weaknesses in its strategy based on its opponents’ play. Essentially, Libratus did what every good poker player has done for decades: It adjusted to the strategies employed by its opponents on the fly.

Libratus’ three-pronged approach to the game included:

  • Creating an abstract version of the game which was easier to solve
  • Creating a more detailed plan-of-action based on how the game was playing out
  • Improving on that plan as the competition went on by using its opponents’ bet sizes to detect potential holes in its own strategy and filling them in

Simply put, Libratus began with a basic strategy designed by looking at a simplified version of the game. It worked out that strategy in finer detail as each hand played out. Finally, it patched weaknesses in its own strategy that its opponents’ play had revealed.

Because it repaired potential holes in its own game rather than chasing its opponents’ mistakes, Libratus avoided opening itself up to exploitation if an opponent switched to a different strategy.

When opponents used bet sizes its plan did not cover, Libratus would add the missing decision branches and compute strategies for them. Then it would add those strategies to its plan going forward.

Libratus demoralizes opponents

After losing in January, Les described playing Libratus as a slightly demoralizing experience:

“Libratus turned out to be way better than we imagined. It’s slightly demoralizing. If you play a human and lose, you can stop, take a break. Here we have to show up to take a beating every day for 11 hours a day. It’s a real different emotional experience when you’re not used to losing that often.”

There may be even further-reaching implications of Libratus’ success. Several bot rings employing AI have been discovered on online poker sites, including PokerStars. The success of Libratus could lead to an increase in the prevalence of bots online. However, this specific technology has yet to be tested in full-ring games.

The future of AI

In the end, the researchers built an artificial intelligence computer program that can beat the pros at poker. However, Sandholm and Brown say they are hoping the AI can ultimately do a lot more:

“The techniques that we developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including non-recreational applications. Due to the ubiquity of hidden information in real-world strategic interactions, we believe the paradigm introduced in Libratus will be critical to the future growth and widespread application of AI.”

The technology behind Libratus has now been licensed to Sandholm’s company Strategic Machine. The company aims to apply strategic reasoning technologies to many different applications.

An artificial intelligence called Libratus beat four top professional poker players in No-Limit Texas Hold’em by breaking the game into smaller, more manageable parts and adjusting its strategy as play progressed during the competition, researchers report.

In a new paper in Science, Tuomas Sandholm, professor of computer science at Carnegie Mellon University, and Noam Brown, a PhD student in the computer science department, detail how their AI achieved superhuman performance in a game with more decision points than atoms in the universe.

AI programs have defeated top humans in checkers, chess, and Go—all challenging games, but ones in which both players know the exact state of the game at all times. Poker players, by contrast, contend with hidden information: what cards their opponents hold and whether an opponent is bluffing.

Imperfect information

In a 20-day competition involving 120,000 hands this past January at Pittsburgh’s Rivers Casino, Libratus became the first AI to defeat top human players at Heads-Up, No-Limit Texas Hold’em, the primary benchmark and longstanding challenge problem for imperfect-information game-solving by AIs.

Libratus beat each of the players individually in the two-player game and collectively amassed more than $1.8 million in chips. Measured in milli-big blinds per hand (mbb/hand), a standard used by imperfect-information game AI researchers, Libratus decisively defeated the humans by 147 mbb/hand. In poker lingo, this is 14.7 big blinds per 100 hands.
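The conversion behind that win rate is simple arithmetic: one milli-big blind is a thousandth of a big blind, and poker players usually quote results per 100 hands. A minimal sketch in Python, where the function name is chosen here for illustration and does not come from the paper:

```python
def mbb_per_hand_to_bb_per_100(mbb_per_hand: float) -> float:
    """Convert milli-big blinds per hand to big blinds per 100 hands."""
    bb_per_hand = mbb_per_hand / 1000.0  # 1 mbb is 1/1000 of a big blind
    return bb_per_hand * 100.0           # scale up to a 100-hand sample


# The reported margin of 147 mbb/hand, expressed in big blinds per 100 hands.
rate = mbb_per_hand_to_bb_per_100(147)
print(round(rate, 1))
```

Dividing by 1,000 and multiplying by 100 amounts to dividing by 10, which is why 147 mbb/hand corresponds to 14.7 big blinds per 100 hands.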

“The techniques in Libratus do not use expert domain knowledge or human data and are not specific to poker,” Sandholm and Brown write in the paper. “Thus, they apply to a host of imperfect-information games.”

Such hidden information is ubiquitous in real-world strategic interactions, they note, including business negotiation, cybersecurity, finance, strategic pricing, and military applications.

Three modules

Libratus includes three main modules. The first computes an abstraction of the game that is smaller and easier to solve than the full game, which has 10^161 (the number 1 followed by 161 zeroes) possible decision points. It then creates its own detailed strategy for the early rounds of Texas Hold’em and a coarse strategy for the later rounds. This strategy is called the blueprint strategy.

One example of these abstractions in poker is grouping similar hands together and treating them identically.

“Intuitively, there is little difference between a king-high flush and a queen-high flush,” Brown says. “Treating those hands as identical reduces the complexity of the game and, thus, makes it computationally easier.” In the same vein, similar bet sizes also can be grouped together.
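To make the idea of grouping concrete, here is a toy sketch in Python. It is not Libratus’ abstraction algorithm, which the article does not describe in detail. The strength scores, the bucket count, and the list of bet sizes are all made-up values for illustration, and both function names are hypothetical.

```python
from typing import Sequence


def bucket_of(strength: float, num_buckets: int) -> int:
    """Map a hand-strength estimate in [0, 1] to a bucket index."""
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must lie in [0, 1]")
    # The top edge (strength == 1.0) is folded into the last bucket.
    return min(int(strength * num_buckets), num_buckets - 1)


def nearest_abstract_bet(bet_fraction: float, sizes: Sequence[float]) -> float:
    """Snap a bet, given as a fraction of the pot, to the closest listed size."""
    return min(sizes, key=lambda s: abs(s - bet_fraction))


# Hypothetical strength estimates, invented for this example.
strengths = {"king-high flush": 0.97, "queen-high flush": 0.96, "pair of twos": 0.35}
buckets = {hand: bucket_of(s, 10) for hand, s in strengths.items()}

# Hypothetical abstract bet sizes, as fractions of the pot.
ABSTRACT_BET_SIZES = [0.25, 0.5, 1.0, 2.0]
snapped = nearest_abstract_bet(0.55, ABSTRACT_BET_SIZES)

print(buckets, snapped)
```

With these invented numbers the two flushes land in the same bucket and are treated identically, while a bet of 55 percent of the pot is handled as a half-pot bet. Fewer distinct hands and bet sizes mean a smaller game to solve.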

In the final rounds of the game, however, a second module constructs a new, finer-grained abstraction based on the state of play. It also computes a strategy for this subgame in real time that balances strategies across different subgames using the blueprint strategy for guidance, something that needs to be done to achieve safe subgame solving. During the January competition, Libratus performed this computation using the Pittsburgh Supercomputing Center’s Bridges computer.

When an opponent makes a move that is not in the abstraction, the module computes a solution to this subgame that includes the opponent’s move. Sandholm and Brown call this “nested subgame solving.” DeepStack, an AI created by the University of Alberta to play Heads-Up, No-Limit Texas Hold’em, also includes a similar algorithm, called continual re-solving. DeepStack has yet to be tested against top professional players, however.

The third module is designed to improve the blueprint strategy as competition proceeds. Typically, Sandholm says, AIs use machine learning to find mistakes in the opponent’s strategy and exploit them. But that also opens the AI to exploitation if the opponent shifts strategy. Instead, Libratus’ self-improver module analyzes opponents’ bet sizes to detect potential holes in Libratus’ blueprint strategy. Libratus then adds these missing decision branches, computes strategies for them, and adds them to the blueprint.
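A rough sketch of that gap-finding step, in Python, might look like the following. The scoring rule (how often a bet size was seen, multiplied by its distance from the nearest size already covered), the tolerance, and every name in the code are assumptions made for illustration. The article does not say how Libratus actually ranks the missing branches.

```python
from collections import Counter
from typing import List, Sequence


def find_gaps(observed: Sequence[float], abstract_sizes: Sequence[float],
              tolerance: float = 0.1, top_k: int = 2) -> List[float]:
    """Return observed bet sizes (fractions of the pot) poorly covered by the abstraction."""
    counts = Counter(round(bet, 2) for bet in observed)
    scores = {}
    for size, seen in counts.items():
        gap = min(abs(size - a) for a in abstract_sizes)
        if gap > tolerance:            # far from every size already covered
            scores[size] = seen * gap  # assumed priority: frequency times distance
    return sorted(scores, key=lambda s: scores[s], reverse=True)[:top_k]


# Hypothetical data: sizes the blueprint covers and sizes opponents actually used.
abstract_sizes = [0.5, 1.0, 2.0]
observed = [0.5, 0.75, 0.75, 0.75, 1.0, 1.5, 1.5, 0.52]

gaps = find_gaps(observed, abstract_sizes)
expanded = sorted(abstract_sizes + gaps)
print(gaps, expanded)
```

In this toy run the sizes 1.5 and 0.75 are flagged and added to the list, while 0.52 is ignored because it sits close to the existing half-pot size. A real system would then still have to compute strategies for the new branches, which this sketch leaves out.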

AI vs. AI

In addition to beating the human pros, researchers evaluated Libratus against the best prior poker AIs. These included Baby Tartanian8, a bot developed by Sandholm and Brown that won the 2016 Annual Computer Poker Competition held in conjunction with the Association for the Advancement of Artificial Intelligence Annual Conference.

Whereas Baby Tartanian8 beat the next two strongest AIs in the competition by 12 (plus/minus 10) mbb/hand and 24 (plus/minus 20) mbb/hand, Libratus bested Baby Tartanian8 by 63 (plus/minus 28) mbb/hand. DeepStack has not been tested against other AIs, the authors note.

“The techniques that we developed are largely domain independent and can thus be applied to other strategic imperfect-information interactions, including nonrecreational applications,” Sandholm and Brown conclude. “Due to the ubiquity of hidden information in real-world strategic interactions, we believe the paradigm introduced in Libratus will be critical to the future growth and widespread application of AI.”

The technology has been exclusively licensed to Strategic Machine Inc., a company Sandholm founded to apply strategic reasoning technologies to many different applications.

The National Science Foundation and the Army Research Office supported this research.

Source: Carnegie Mellon University