Humans Top AI in Man-v-Machine Heads-Up NL Contest
The final results are in from the two-week-long competition hosted by Carnegie Mellon University which pitted the school’s advanced poker-playing artificial intelligence (AI) program against four of the world’s highest-ranked heads-up no-limit hold’em pros.
Those results? The four human players eked out a narrow but statistically insignificant win, according to the creators of the “Brains Vs. Artificial Intelligence” matchup. The contest was hosted at Pittsburgh’s Rivers Casino; Pittsburgh is also home to Carnegie Mellon U, where the poker-playing “Claudico” program was developed.
Three of the four human pros, Doug Polk, Bjorn Li and Dong Kim, won their 20,000-hand matchups against Claudico, while the program prevailed against the fourth HU NL pro, Jason Les. All told, the human players combined for a $732,713 win in play-money chips against the AI program, a little less than 0.5% of the total $170 million in play money that was bet during the contest.
The individual results, each in 20,000-hand, head-to-head matchups:
- Bjorn Li over Claudico by $529,033
- Doug Polk over Claudico by $213,671
- Dong Kim over Claudico by $70,491
- Claudico over Jason Les by $80,482
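For readers who want to check the arithmetic, the short Python sketch below adds up the four reported results and compares the net to the roughly $170 million wagered. The variable names are purely illustrative, and the $170 million total is the rounded figure reported for the contest, so the percentage is approximate.

```python
# Per-player results versus Claudico, in play-money dollars, as reported above.
# Positive means the human finished ahead; negative means Claudico did.
results = {
    "Bjorn Li": 529_033,
    "Doug Polk": 213_671,
    "Dong Kim": 70_491,
    "Jason Les": -80_482,
}

TOTAL_WAGERED = 170_000_000  # rounded total bet during the contest, as reported
HANDS_PER_MATCH = 20_000     # each pro played 20,000 hands

net_human_win = sum(results.values())
share_of_wagered = net_human_win / TOTAL_WAGERED
total_hands = HANDS_PER_MATCH * len(results)
avg_win_per_hand = net_human_win / total_hands

print(f"Net human win: ${net_human_win:,}")
print(f"Share of total wagered: {share_of_wagered:.2%}")
print(f"Average win per hand: ${avg_win_per_hand:.2f}")
```

The net works out to $732,713, or about 0.43% of the money wagered, which averages to a little over $9 in play money per hand across the 80,000 hands.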
The Claudico program is the brainchild of Tuomas Sandholm, a professor of computer science at CMU who has directed the development of the program. It’s not the first time such a man-v-machine challenge has been held, though this one did have the twist of being the first time a no-limit program was pitted against top players. Back in 2007, for instance, the Polaris limit hold’em program developed at Canada’s University of Alberta was pitted against well-known pros Ali Eslami and Phil Laak.
Heads-up no-limit poker, by comparison, is significantly more complex than limit, even as HU NL itself remains the simplest form of no-limit poker. The contest’s format was itself a test of game-theory strategies, with none of the psychological warfare that marks live poker in real cash games and tournaments, meaning that despite a solid performance, computer poker-playing programs still aren’t quite up to the capabilities of the human brain.
They’re not too far behind, however, and Claudico developer Sandholm was proud of the showing his group’s software turned in. “We knew Claudico was the strongest computer poker program in the world, but we had no idea before this competition how it would fare against four Top 10 poker players,” said Sandholm. “It would have been no shame for Claudico to lose to a set of such talented pros, so even pulling off a statistical tie with them is a tremendous achievement.”
The roughly 0.5% win over 80,000 hands isn’t quite a tie, since HU NL pros survive on narrow edges at the game’s highest levels, but that doesn’t make the computer’s showing poor by any standard. (The players’ collective win fell narrowly short of the margin deemed “statistically significant” over 80,000 hands. From Claudico’s perspective, that’s somewhat equivalent to being the boxer who’s down for the count but is saved by the bell in the final round.)
Polk, who notched the second-largest individual win against the program, judged Claudico to be “good but not a top-notch player.” Polk also noted that while the program’s hand-playing decisions were solid, the Claudico program at rare moments made bizarre bet-sizing choices. “Betting $19,000 to win a $700 pot just isn’t something that a person would do,” said Polk, in response to one quirky Claudico gambit.
Nor will the exact mechanisms that the Claudico AI used to make such intriguing moves be easy to decipher. The program is a self-teaching construct currently occupying two terabytes of data, making it virtually impenetrable to simple strategic analyses. Claudico and, presumably, any successor programs are designed to perform in a game-theory-optimal (GTO) mode, rather than necessarily homing in on the specific strategies of a given opponent.
That’s both a strength and a weakness, but as Sandholm noted, the chance to play against the top HUNL pros gives the Claudico project’s developers 80,000 new hands of live data showing how top players react in certain situations. That can only make the next generation of poker-playing AI software even better. Jason Les, the only one of the four pros whom Claudico bested, noted that with another year of development, Claudico or its successor might truly match up with the best HUNL pros.
For now, though, it’s back to the drawing board. The human participants received a little something for their work as well, getting appearance fees drawn from $100,000 put up for the contest by its primary sponsors, the hosting Rivers Casino and Intel.