Is Facebook AI Research a leader in poker bot development?
In 2019, Facebook released information about a powerful poker bot, Pluribus. It was a further step in AI development, but not the "poker killer" many had anticipated. Unlike earlier bots limited to heads-up (HU) play, Pluribus could also hold its own at 6-max against skilled players.
This time, however, there were no loud statements about the new bot, ReBeL, its advanced capabilities, or its impact on the poker industry. There were no videos either. Instead, the Facebook AI Research team published a 27-page paper outlining the general principles behind the bot and comparing it with older programs.
What is ReBeL?
The bot's name is an acronym for "Recursive Belief-based Learning," which reflects its focus on self-learning under imperfect information. The title of the paper says as much: Combining Deep Reinforcement Learning and Search for Imperfect-Information Games, written by Noam Brown, Anton Bakhtin, Adam Lerer, and Qucheng Gong from the Facebook AI Research team.
ReBeL builds on ideas from earlier bots such as DeepStack, which in 2017 was among the first to beat professional players at heads-up no-limit hold'em. Its main difference from previous developments is the use of so-called public belief states (PBS).
A PBS describes the game from a shared point of view: given everything that is publicly known (the board, the bets made so far), it assigns a probability to each hand that each player could be holding. The bot learns from these states through self-play, taking into account not only the current state of the game but also what the opponents' earlier decisions imply about their hands, and it iterates on its own strategy so that it cannot easily be exploited.
In other words, ReBeL analyzes not only the hand itself but also how the opponent evaluates it, just like successful players do.
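To make the idea of a belief over hidden hands more concrete, here is a minimal sketch in Python. It is not ReBeL's code (which has not been released) and not the paper's algorithm; it only shows the basic Bayes update that any belief-based approach relies on: after the opponent acts, hands that would take that action more often become more likely. The hand categories, the prior, and the strategy table are all made-up numbers for illustration.

```python
def update_belief(prior, strategy, action):
    """Bayes update of a belief over an opponent's private hand.

    prior:    dict hand -> probability before the action is seen
    strategy: dict hand -> dict action -> probability that the opponent
              takes that action with that hand (an assumed model)
    action:   the action that was actually observed
    """
    # Weight each hand by how likely it was to produce the observed action.
    weights = {h: p * strategy[h].get(action, 0.0) for h, p in prior.items()}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("observed action is impossible under the assumed strategy")
    # Normalize so the updated belief sums to 1 again.
    return {h: w / total for h, w in weights.items()}


# Hypothetical numbers, chosen only for illustration.
prior = {"strong": 0.2, "medium": 0.5, "weak": 0.3}
strategy = {
    "strong": {"raise": 0.9, "call": 0.1},
    "medium": {"raise": 0.3, "call": 0.7},
    "weak": {"raise": 0.1, "call": 0.9},
}

posterior = update_belief(prior, strategy, "raise")
for hand, prob in posterior.items():
    print(f"{hand}: {prob:.3f}")
```

With these toy numbers, a raise shifts most of the probability toward strong and medium hands. A real bot has to track such beliefs for every player and every possible hand at once, which is a far larger problem than this sketch suggests.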
What results did the new bot show?
Compared to all of its predecessors, ReBeL is much faster: according to the paper, it averaged under 2 seconds per hand and never needed more than 5 seconds to make a decision.
The only poker player who played against ReBeL is Dong Kim (he was also one of the players who lost to Libratus).
After 7,500 hands, the AI was ahead of the human player by 0.165 BB per hand, while Libratus had scored 0.147 BB per hand against its human opponents.
Of course, a more realistic test will require more hands against more players, especially to understand how PBS works.
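A rough calculation shows why 7,500 hands is a small sample. The Python sketch below converts the reported win rate into total big blinds and estimates how the uncertainty shrinks with more hands. The per-hand standard deviation of 10 BB is an assumption chosen for illustration, not a figure from the paper, and variance reduction techniques lower the effective value.

```python
import math


def total_won_bb(win_rate_bb_per_hand, hands):
    """Total big blinds won at a given per-hand win rate."""
    return win_rate_bb_per_hand * hands


def standard_error_bb(std_dev_bb_per_hand, hands):
    """Standard error of the per-hand win rate after a number of hands."""
    return std_dev_bb_per_hand / math.sqrt(hands)


HANDS = 7_500
REBEL_WIN_RATE = 0.165  # BB per hand, as reported in the article
ASSUMED_STD_DEV = 10.0  # BB per hand; hypothetical value for illustration

total = total_won_bb(REBEL_WIN_RATE, HANDS)
error_now = standard_error_bb(ASSUMED_STD_DEV, HANDS)
error_4x = standard_error_bb(ASSUMED_STD_DEV, 4 * HANDS)

print(f"Total won over {HANDS} hands: {total:.1f} BB")
print(f"Standard error at {HANDS} hands: {error_now:.3f} BB per hand")
print(f"Standard error at {4 * HANDS} hands: {error_4x:.3f} BB per hand")
```

Under this assumption, the uncertainty at 7,500 hands is of the same order as the win rate itself, and playing four times as many hands only halves it.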
How dangerous is ReBeL for online poker?
The bot developers clearly stated that their goal was not to attack online poker. Their product should help people organize complex systems with imperfect information such as logistics, auctions, and cybersecurity. They also do not intend to release the code.
And to calm poker players, we can say that:
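- As with Pluribus, the win rate for this bot was calculated using AIVAT, a variance reduction technique that adjusts the raw results for luck, so the reported figure is a statistical estimate rather than the amount actually won at the table.
- ReBeL was created for, and works only in, two-player zero-sum games, where one player's win is exactly the other's loss. That assumes, for example, that no rake is taken, which is not the case in real online poker.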
- The bot is designed only for HU games.
Therefore, in the industry's ongoing struggle against unfair play and artificial intelligence, ReBeL in its current form is not a weapon that will end up on the side of the bots.