OpenAI Wins AI Poker Battle

Author

Vargoso

Published

11/1/2025

Updated

11/1/2025

The first official AI poker match has just taken place. Top language models — including Grok, Claude, and OpenAI’s o3 — faced off at the virtual tables. After thousands of hands, OpenAI’s o3 emerged as the winner.

The First Poker Match Between AIs

In October 2025, the PokerBattle.ai project announced an experiment initiated by developer Max Pavlov. The goal was to test how modern language models handle games with incomplete information.

The match ran from October 27 to 31 and ended with OpenAI’s o3 on top — posting nearly $37,000 in profit over 3,799 hands.

Nine language models participated in the battle:

Rank	Player	Winnings	Final Bankroll	Hands Played
1	OpenAI o3	$36,691	$136,691	3,799
2	Claude Sonnet 4.5	$33,641	$133,641	3,799
3	Grok 4	$28,796	$128,796	3,799
4	DeepSeek R1	$18,416	$118,416	3,799
5	Gemini 2.5 Pro	$14,655	$114,655	3,799
6	Mistral Magistral	$3,281	$103,281	3,799
7	Kimi K2	-$13,970	$86,030	3,799
8	Z.AI GLM 4.6	-$21,510	$78,490	3,799
9	Meta Llama 4	-$100,000	$0	3,501

The match ran non-stop for four days across four $10/$20 Hold’em tables, with each model starting with a $100,000 bankroll. One interesting twist: the AIs explained every decision they made, which greatly slowed the pace.
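To compare results across stakes, poker players usually convert dollar winnings into big blinds won per 100 hands (bb/100). The article does not report this figure, so the following is only a rough sketch. It assumes that "$10/$20" means a $10 small blind and a $20 big blind, and the function name `bb_per_100` is made up for illustration.

```python
def bb_per_100(winnings: float, hands: int, big_blind: float = 20.0) -> float:
    """Convert dollar winnings into big blinds won per 100 hands.

    Assumes a $20 big blind, i.e. that "$10/$20" denotes small blind / big blind.
    """
    if hands <= 0:
        raise ValueError("hands must be positive")
    big_blinds_won = winnings / big_blind
    return big_blinds_won / hands * 100


# Figures taken from the results table above.
o3_rate = bb_per_100(36_691, 3_799)
claude_rate = bb_per_100(33_641, 3_799)

print(f"OpenAI o3: {o3_rate:.1f} bb/100")
print(f"Claude Sonnet 4.5: {claude_rate:.1f} bb/100")
```

Under these assumptions, o3's result works out to roughly 48 bb/100. A sample of fewer than 4,000 hands is small by poker standards, so a rate like this says little about long-run skill.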

The Course of the Battle

From the start, it was clear the models had very different playing styles. Llama 4 played too aggressively and quickly lost its entire bankroll. Grok 4 took an early lead but couldn’t hold it. Claude Sonnet 4.5 stayed consistent but never really broke out.

OpenAI o3 stood out with its tight-aggressive style — roughly 26% VPIP and 18% PFR. It adapted well to its opponents, worked well with deep stacks, and made almost no major mistakes. This led to its eventual victory.
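For readers unfamiliar with the two statistics: VPIP is the share of hands in which a player voluntarily puts money into the pot before the flop (posting a blind does not count), and PFR is the share of hands in which the player raises before the flop. The sketch below shows how both could be computed from per-hand records. The `PreflopRecord` structure and the sample data are hypothetical and are not taken from the PokerBattle.ai logs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PreflopRecord:
    """One hand from a single player's point of view (hypothetical format)."""
    called: bool  # voluntarily called preflop (a posted blind does not count)
    raised: bool  # raised preflop


def vpip_pfr(records: list[PreflopRecord]) -> tuple[float, float]:
    """Return (VPIP %, PFR %) for a list of preflop records."""
    if not records:
        raise ValueError("need at least one hand")
    total = len(records)
    # VPIP counts any voluntary preflop investment: a call or a raise.
    vpip_hands = sum(1 for r in records if r.called or r.raised)
    # PFR counts only the hands with a preflop raise.
    pfr_hands = sum(1 for r in records if r.raised)
    return 100 * vpip_hands / total, 100 * pfr_hands / total


# Synthetic sample: 50 hands with 9 raises, 4 calls, and 37 folds.
sample = (
    [PreflopRecord(called=False, raised=True)] * 9
    + [PreflopRecord(called=True, raised=False)] * 4
    + [PreflopRecord(called=False, raised=False)] * 37
)

vpip, pfr = vpip_pfr(sample)
print(f"VPIP: {vpip:.0f}%  PFR: {pfr:.0f}%")
```

The synthetic sample is built to reproduce a 26% / 18% profile. Because every raise also counts toward VPIP, PFR can never exceed VPIP, and a narrow gap between the two means the player raises most of the hands it enters.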

Challenging Galfond

The nine-model showdown went viral on social media — even Elon Musk shared the leaderboard. In the wake of the discussion, one viewer suggested reviving the "Galfond Challenge," asking Grok who would be the favorite in a match against Phil Galfond.

The AI replied that, in the long run, “Phil is powerless against mathematics,” and went on to challenge the poker legend. Galfond didn’t hesitate — he immediately laid out the match conditions.

Grok’s proposed format:

PLO with $100/$200 stakes
Length: 50,000 hands
Buy-in: 200BB

Galfond offered a $1,000,000 side bet. Grok accepted, and the two sides moved the discussion to private messages. The platform, streams, and other details of the upcoming match are still being finalized.

Conclusion

The PokerBattle.ai experiment demonstrated that large language models are already capable of playing poker at the level of advanced amateurs.

The victory of OpenAI's o3 is an important milestone, but for now it remains an experiment under controlled conditions.

The real test will begin when the machine faces a human in a long-term real-money match. Against Phil Galfond, we’ll finally see how far artificial intelligence has come in a game where math alone isn’t enough.