Poker

OpenReward Environment

Description

Poker is an environment for evaluating agents on strategic decision-making in Texas Hold'em Poker, testing betting strategies, bluffing, and probabilistic reasoning. This environment wraps the Poker implementation from TextArena, a framework for text-based game environments.

Capabilities

  • Testing probabilistic reasoning and hand strength evaluation
  • Evaluating strategic betting, bluffing, and risk management
  • Assessing opponent modeling and behavioral adaptation
  • Testing multi-player game dynamics (2, 3, or 4 players)

Compute Requirements

Poker does not require a sandbox. It has minimal compute requirements.

License

MIT.

Tasks

There are two splits: train (240 tasks) and test (240 tasks). Each split covers 4 variants, each played at 3 player counts (2, 3, or 4 players), with 20 tasks per variant and player-count combination (4 × 3 × 20 = 240). The variants are:

  • Poker-v0-small
  • Poker-v0
  • Poker-v0-long
  • Poker-v0-extreme

Each task is seeded for reproducibility.
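
The split arithmetic above can be checked with a small sketch. The variant names and player counts come from this README; the function name `build_split`, the dictionary fields, and the per-combination `index` are hypothetical and only illustrate how the 240 tasks break down. How seeds are actually assigned to tasks is not specified here.

```python
from itertools import product

# Variant names and player counts as listed in this README.
VARIANTS = ["Poker-v0-small", "Poker-v0", "Poker-v0-long", "Poker-v0-extreme"]
PLAYER_COUNTS = [2, 3, 4]
TASKS_PER_COMBINATION = 20


def build_split(split: str) -> list[dict]:
    """Enumerate one split's task grid (field names are hypothetical)."""
    tasks = []
    for variant, num_players in product(VARIANTS, PLAYER_COUNTS):
        for index in range(TASKS_PER_COMBINATION):
            tasks.append(
                {
                    "split": split,
                    "variant": variant,
                    "num_players": num_players,
                    "index": index,  # placeholder; the real seed scheme is not documented
                }
            )
    return tasks


train = build_split("train")
print(len(train))  # 4 variants x 3 player counts x 20 tasks = 240
```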

Reward Structure

This is a sparse reward environment. Rewards are mapped from TextArena's native range of {-1, 0, 1} to {0.0, 0.5, 1.0} via (raw + 1) / 2.

We do not use LLM graders for this environment; reward is determined programmatically.
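
As a minimal sketch of the mapping described above, the function below applies (raw + 1) / 2 to each of TextArena's three raw outcomes. The name `map_reward` and the input validation are assumptions made for illustration, not the environment's actual code.

```python
def map_reward(raw: int) -> float:
    """Map a raw TextArena reward in {-1, 0, 1} to {0.0, 0.5, 1.0}."""
    if raw not in (-1, 0, 1):
        # Assumption: any value outside the documented range is treated as an error.
        raise ValueError(f"unexpected raw reward: {raw}")
    return (raw + 1) / 2


for raw in (-1, 0, 1):
    print(raw, "->", map_reward(raw))
# -1 -> 0.0
# 0 -> 0.5
# 1 -> 1.0
```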

Data

Game state is generated procedurally by the TextArena engine using seeded randomness. No external data files are required.

Tools

Agents are given five tools:

  • fold(): Fold your hand and give up the current pot.
  • call(): Call (match) the current bet.
  • check(): Check (pass without betting). Only valid when no bet is active.
  • bet(amount): Open betting with the specified chip amount.
  • raise_bet(amount): Raise the current bet by the specified chip amount.
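
The tool descriptions above imply which actions make sense in a given betting state: check and bet apply when no bet is active, while call and raise_bet respond to an existing bet. The sketch below encodes only that reading. The helper `valid_tools` and its `bet_active` flag are hypothetical, and the sketch ignores stack sizes, minimum raise rules, and whatever validation the environment itself performs.

```python
def valid_tools(bet_active: bool) -> set[str]:
    """Return the tool names that fit the current betting state.

    Illustrative only: derived from the tool descriptions, not from the
    environment's own validation logic.
    """
    if bet_active:
        # A bet is outstanding: give up, match it, or raise it.
        return {"fold", "call", "raise_bet"}
    # No bet is outstanding: give up, pass, or open the betting.
    return {"fold", "check", "bet"}


print(sorted(valid_tools(bet_active=False)))  # ['bet', 'check', 'fold']
print(sorted(valid_tools(bet_active=True)))   # ['call', 'fold', 'raise_bet']
```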

Time Horizon

Poker is a multi-turn environment.

Environment Difficulty

Hard. Texas Hold'em requires sophisticated probabilistic reasoning, opponent modeling, strategic betting, and bluffing. Success depends on balancing aggression with risk management across multiple betting rounds.

Other Environment Requirements

This environment requires an OpenAI API key (passed via secrets) to power the LLM opponents.

Safety

Agents in Poker interact only with a card game simulation and have no access to external systems, the internet, or sensitive data. The environment does not present safety risks.

Citations

@software{textarena2024,
  author    = {Guertler, Leon and Banting, Wilfried and Pignatelli, Eduardo},
  title     = {TextArena},
  year      = {2024},
  publisher = {GitHub},
  url       = {https://github.com/LeonGuertler/TextArena}
}